As before, I am blogging from Strata NYC, and am linking to 0xdata’s website:
I’m trying something new today – please find this blog hosted at my new company:
Been too long between blogs…
“TCP Is Not Reliable” – what’s THAT mean?
Means: I can cause TCP to reliably fail in under 5 mins, on at least 2 different modern Linux variants and on modern hardware, both in our datacenter (no hypervisor) and on EC2.
What does “fail” mean? Means the client will open a socket to the server, write a bunch of stuff and close the socket – with no errors of any sort. All standard blocking calls. The server will get no information of any sort that a connection was attempted. Let me repeat that: neither client nor server get ANY errors of any kind, the client gets told he opened/wrote/closed a connection, and the server gets no connection attempt, nor any data, nor any errors. It’s exactly “as if” the client’s open/write/close was thrown in the bit-bucket.
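To make the pattern concrete, here is a minimal Java sketch of the kind of client code I mean. It is not H2O's actual code; the class and method names are made up for illustration. Every call is a standard blocking call, and the method returns normally as long as the local OS accepted the connect, the write and the close. Nothing in it waits for the server *process* to confirm it ever saw the bytes.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;

// Hypothetical illustration of the open/write/close pattern; not H2O code.
class FireAndForgetSend {
    static void send(String host, int port, byte[] payload) throws IOException {
        // Blocking connect: returns once the TCP handshake completes at the OS level,
        // which can happen before the server application calls accept().
        try (Socket s = new Socket(host, port)) {
            OutputStream out = s.getOutputStream();
            out.write(payload);   // returns once the bytes are handed to the local send buffer
            out.flush();
        }                         // close() does not wait for the peer to read anything
    }
}
```

If `send` returns without throwing, all the client really knows is that its own OS took the bytes.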
We’d been having these rare failures under heavy load where it was looking like a dropped RPC call. H2O has its own RPC mechanism, built over the RUDP layer (see all the task-tracking code in the H2ONode class). Integrating the two layers gives a lot of savings in network traffic: most small-data remote calls (e.g. nearly all the control logic) require exactly 1 UDP packet to start the call, and 1 UDP packet with the response. For large-data calls (i.e., moving a 4Meg “chunk” of data between nodes) we use TCP – mostly for its flow-control & congestion-control. Since TCP is also reliable, we bypassed the Reliability part of the RUDP. If you look in the code, the AutoBuffer class lazily decides between UDP or TCP send styles, based on the amount of data to send. The TCP stuff used to just open a socket, send the data & close.
So as I was saying, we’d have these rare failures under heavy load that looked like a dropped TCP connection (was hitting the same asserts as dropping a UDP packet, except we had dropped-UDP-packet recovery code in there and working forever). Finally Kevin, our systems hacker, got a reliable setup (reliably failing?) – it was an H2O parse of a large CSV dataset into a 5-node cluster… then a 4-node cluster, then a 3-node cluster. I kept adding asserts, and he kept shrinking the test setup, but still nothing seemed obvious – except that obviously during the parse we’d inhale a lot of data, ship it around our 3-node clusters with lots of TCP connections, and then *bang*, an assert would trip about missing some data.
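The lazy UDP-vs-TCP choice can be pictured roughly like this. This is a sketch only, not the real AutoBuffer: the class name, the method names and the `MAX_UDP_PAYLOAD` cutoff of 1400 bytes are all assumptions I picked for illustration.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of "decide the transport late, based on how much was written".
class LazyTransportBuffer {
    enum Transport { UDP, TCP }

    // Assumed cutoff: about what fits in one datagram on a typical Ethernet MTU.
    static final int MAX_UDP_PAYLOAD = 1400;

    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();

    // Append data; no transport is chosen yet.
    LazyTransportBuffer put(byte[] data) {
        bytes.write(data, 0, data.length);
        return this;
    }

    int size() { return bytes.size(); }

    // The decision is made only when asked, from the total size accumulated so far.
    Transport transport() {
        return bytes.size() <= MAX_UDP_PAYLOAD ? Transport.UDP : Transport.TCP;
    }
}
```

A small control message stays under the cutoff and goes out as a single UDP packet; a 4Meg chunk blows past it and gets a TCP connection.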
Occam’s Razor dictated we look at the layers below the Java code – the JVM, the native, the OS layers – but these are typically very opaque. The network packets, however, are easily visible with wireshark tools. So we logged every packet. It took another few days of hard work, but Kevin triumphantly presented me with a wireshark log bracketing the Java failure… and there it was in the log: a broken TCP connection. We stared harder.
In all these failures the common theme is that the receiver is very heavily loaded, with many hundreds of short-lived TCP connections being opened/read/closed every second from many other machines. The sender sends a ‘SYN’ packet, requesting a connection. The sender (optimistically) sends 1 data packet; optimistic because the receiver has yet to acknowledge the SYN packet. The receiver, being much overloaded, is very slow. Eventually the receiver returns a ‘SYN-ACK’ packet, acknowledging both the open and the data packet. At this point the receiver’s JVM has not been told about the open connection; this work is all happening at the OS layer alone. The sender, being done, sends a ‘FIN’ and does NOT wait for it to be acknowledged (all data has already been acknowledged). The receiver, being heavily overloaded, eventually times out internally (probably waiting for the JVM to accept the open-call, and the JVM being overloaded is too slow to get around to it) – and sends a RST (reset) packet back… wiping out the connection and the data. The sender, however, has moved on – it already sent a FIN & closed the socket, so the RST is for a closed connection. Net result: sender sent, but the receiver reset the connection without informing either the JVM process or the sender.
Kevin crawled the Linux kernel code, looking at places where connections get reset. There are too many to tell which exact path we triggered, but it is *possible* (not confirmed) that Linux decided it was the subject of a DDOS attack and started closing open-but-not-accepted TCP connections. There are knobs in Linux you can tweak here, and we did – and could make the problem go away, or be much harder to reproduce.
With the bug root-caused in the OS, we started looking at our options for fixing the situation. Asking our clients to either upgrade their kernels, or use kernel-level network tweaks was not in the cards. We ended up implementing two fixes: (1) we moved the TCP connection parts into the existing Reliability layer built over UDP. Basically, we have an application-level timeout and acknowledgement for TCP connections, and will retry TCP connections as needed. With this in place, the H2O crash goes away (although if the code triggers, we log it and use app-level congestion delay logic). And (2) we multiplex our TCP connections, so the rate of “open TCPs/sec” has dropped to 1 or 2 – and with this 2nd fix in place we never see the first issue.
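The first fix boils down to "don't believe a TCP send until the other side says so, and retry if it doesn't". Here is a rough Java sketch of that idea. It is not the H2O implementation: the names, the one-byte `ACK` value and the fixed retry count are assumptions for illustration, and it presumes a receiver written to send that ack byte back once it has read the whole payload.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch of an application-level ack + retry wrapped around a TCP send.
class AckedTcpSend {
    static final int ACK = 0x06;   // assumed one-byte acknowledgement from the receiver

    /** One delivery attempt; true only if the receiver acknowledged the payload. */
    interface Attempt {
        boolean trySend(byte[] payload) throws IOException;
    }

    /** TCP attempt: write the payload, then block (with a timeout) for the ack byte. */
    static Attempt overTcp(InetSocketAddress target, int ackTimeoutMs) {
        return payload -> {
            try (Socket s = new Socket()) {
                s.connect(target, ackTimeoutMs);
                s.setSoTimeout(ackTimeoutMs);   // a read that waits longer throws SocketTimeoutException
                OutputStream out = s.getOutputStream();
                out.write(payload);
                out.flush();
                s.shutdownOutput();             // done writing, but keep the socket open to read the ack
                InputStream in = s.getInputStream();
                return in.read() == ACK;        // -1 (EOF) or any other byte counts as "not acked"
            }
        };
    }

    /** Retry until acked; returns the number of tries used, or throws after maxTries. */
    static int sendReliably(Attempt attempt, byte[] payload, int maxTries) throws IOException {
        IOException last = null;
        for (int i = 1; i <= maxTries; i++) {
            try {
                if (attempt.trySend(payload)) return i;
            } catch (IOException e) {           // timeout, reset, refused: all mean "try again"
                last = e;
            }
        }
        throw new IOException("no ack after " + maxTries + " tries", last);
    }
}
```

Usage would look something like `AckedTcpSend.sendReliably(AckedTcpSend.overTcp(addr, 2000), bytes, 5)`. A retry can re-deliver a payload whose ack was the only thing lost, so a real version also needs the receiver to tolerate duplicates, and would want some delay between tries rather than hammering an overloaded node.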
At this point H2O’s RPC calls are rock-solid, even under extreme loads.
Found this decent article: http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
It’s another long drive day for us, we’re trying to get from Stone Mountain (near Atlanta) to Harrisburg, PA today – and Chaplain CT sometime tomorrow. We’re quite expert at breaking camp by now; it takes maybe an hour to pull up all the sleeping bags and fold all the couches and tables back out, to shower and freshen up, to reload fresh water tanks and dump the other tanks. We spend another hour in a local Walmart replacing basic supplies and then we’re on the road.
The kids have figured out how to keep themselves busy on the drive. We’ve got a TV and a Wii, and some amount of reading. There’s singing and tickle fights, and lots of napping. There’s food-making and grumbling about dish cleanup. We camp out in the middle of Pennsylvania. We pass the 3500 miles traveled mark, the 1/2-way point.
We break camp at daylight without waking the kids, and drive maybe two hours before the kids bother to roll out of bed. RV “camping” is a real trick. We make it around New York with only 1 truly crazy driver incident; a bright red pickup truck came blazing up the left side and was clearly out of room to pass us, but did so anyways. He sliced across at a 45-degree angle in front of us. Had I not slammed the brakes and swerved we clearly would have hit the truck; and such a hit would have rolled him.
We finally pull into my Uncle Bill’s farm in Connecticut around 4pm. We settle the RV, then meander down to the river behind the farm, where one of my cousins is RV camping. We swim in the river, cook burgers on the campfire and sit around visiting until way past dark.
We hang out in the farm all day; some of the kids swim in the river or fish or shoot fireworks off after dark. I mostly hung out and caught up with the family news. Shelley & I attended the local church wine-tasting, which was basically a chance to drink a bunch of wines that somebody else bought, and do more catching up on family news.
Shelley & I borrow a cousin’s car and drive to Cape Cod for the day. OMG, a car is SO much nicer to handle than Nessie! We take the slow route up the Cape stopping at every tiny town and inlet. Shelley’s family owned a summer house in Dennis Port 50 or 60 years ago and Shelley was tracing her roots. We managed to stick our toes in the Atlantic and really unwind. Shelley & I both like driving, so it’s another really peaceful down day.
Up early, we force all the kids to take showers (and change clothes; 2 weeks into vacation and our standards are getting pretty lax) and we hit the road. Breaking camp is now a pretty standard operation. By rotating drivers and Shelley driving until the wee hours we make it almost to Indiana.
We pull into the University of Illinois at Urbana-Champaign around noon. I’m giving a talk at 6, and UofI is paying for dinner and 3(!) hotel rooms for us (one for each couple, and one more for the 3 kids). Real showers for all again! Yeah!!! The talk goes really well; it’s my Debugging Data Races talk and it’s a good fit for the summer course on multi-core programming. Shelley and I manage to sneak a beer afterwards.
Again we manage to break camp in short order and do another long day of driving through Illinois, Iowa, and Nebraska. By now we’ve got a rhythm going; Shelley takes the early morning driving shift while everybody sleeps in, then Luke and I alternate shifts until evening (while Shelley naps), and Shelley takes the late night shift. I think we’re covering around 800 miles in a day.
Today it’s the rest of Nebraska and Wyoming, then Utah. My Dad manages to call me out in the middle of I-80 no-where land, to the bemusement of all. We hit high winds on and off all day. At least once I was driving with the steering wheel cranked over a full 180 degrees (and was down to 45 mph) just to stay on the road. 18-wheelers would blow by us, knocking us all over the road. First the bow wave would push us hard to the right, on to the shoulder. Then the wind-block (and my 180-degree wheel position) would drive us hard back onto the road and into the truck, then the trailing suction would pull us harder into the truck – even as I am cranking the wheel the other way as fast as I can… and then the wind would return. It was a nerve-wracking drive. Shelley took over towards evening. Around 11pm the winds became just undrivable even for her. I was dozing when suddenly we got slapped hard over, almost off the shoulder. Even driving at 40mph wasn’t safe. An exit appeared in the middle of nowhere – even with an RV park (mind you, it’s typically 30 miles between exits *without services*). We bailed out. All night long the RV was rocked by winds, like a Giant’s Hand was grabbing the top of Nessie and shaking her like a terrier does a rat.
Morning dawns clear and still. We hit the road again early, as we’ve a long drive today. It’s a quiet drive through to Reno, and then we hit some really crazy drivers again – a combo of construction zone, short merge lanes and stupidity (outside the RV) nearly crushed a darting micro-car. The construction on the Donner Pass was perhaps even worse; we managed to get forced into clipping a roadside reflector on the right (less than a foot away from the mountain stone versus pushing an aggressive SUV into the naked concrete on his left). Finally past all the madness we get to the clear road down from Tahoe and through the Bay Area – but it’s all Homeward Bound on the downhill slide through our home turf!
Home At Last!!!
Some parting stats:
We passed through 22 states (24 for Shelley & I, as we also get to count Rhode Island and Massachusetts).
We drove about 6900 miles.
I bought about $3000 in gas, and $1300 in tires.
We saw 4 close family members in Tucson, 7 in Texas, my brother in Atlanta, and at least 16 in Connecticut (I lost the exact count!).
I did about 20 loads of laundry after returning (the washer ran continuously for 2 days).
We’re on the road by 10am, this time a full day’s drive to Montgomery AL from Katy TX. I forget how big Houston freeways are; at one point I count 9 lanes *in each direction* (18 total lanes!). I’ve never seen so much concrete. It’s otherwise mostly uneventful, though. Traffic is fair to light and the road is good. We stop at a random lakeside park by Lake Charles for lunch. It smells of the ocean and has an alligator pond/cage/viewing area.
While I typically encourage the kids to drink a lot (to survive the desert heat & dry), I don’t check on how much they eat, just that they eat a reasonably balanced diet. So I missed out that Matt hadn’t eaten all day, and was constantly staring heads-down on his iPod on silly flash games. Well, towards afternoon he starts feeling sick, and near dinner he barfs and refuses to eat or drink anything. He cannot even keep down tiny bits of bread or Gatorade; the 2nd barf happens on our bedsheets and pillow. At this point he decides to camp out by the RV toilet and do any more barfing into that (uggh!!! poor guy!!!!), and we decide to cut it short and look for camping for the night. By dinner he’s still unable to keep anything down; we grab to-go food from a collection of fast-food joints and keep rolling to the nearest campsite.
We get 1/2 way between Mobile & Montgomery, AL and pull over into a nice full-service RV park. Shelley & I decide to camp outside in a tent, so Josh can get off the floor (he’s 17 and 6ft tall, lean and flexible… and does not fit in any of the RV pullout/fold-down beds, so he’s been sleeping in the aisle). We want Josh off the floor so Matt can make an emergency run from his foldout bed to the bathroom without interference. It’s beastly hot and humid outside, but I figure it will cool off as the night wears on. Boy was I wrong! It remains 80+ & 80% humidity all night long outside, while the kids were sleeping in air-conditioned luxury. And we get a late night visit from the camp kitten – he’s adorably cute and caterwauls at us, and starts climbing the tent with his razor claws until Shelley takes him for a walk. He follows her like a shadow all over the park until she finally has to lock him in the campground bathroom.
Finally dawn breaks and we move back into the cool RV air. Ahhh, blessed relief. Also, Matt is much better – it’s a common kid 24-hour tummy bug. I start him back in on the BRAT diet, with sips of water – and now he’s very hungry, a good sign. He continues to improve throughout the day and is eating normally by dinner. We pull up camp (we’re getting quite expert at this) and head for Stone Mountain, GA.
Stone Mountain is a giant mountain-sized chunk of granite outside of Atlanta, with a park and a lake. It’s been carved with a 50ft high sculpture and has been slowly improved over the years to include many hiking trails, a sky tram system, lots of outdoor adventure activities and an amusement park. Apparently the “ducks” (amphibious vehicles) are fantastic. We are going there for the July 4th extravaganza – and as a sign that I’m on vacation, I barely know that today is the 3rd and I’ve no idea what day of the week it is. We get there about 3pm and check in to a nice RV camp site.
Shelley cooks a fantastic spaghetti dinner. My brother Eric drives out to camp with us, bringing his best friends’ two small girls (ages 6 & 7) with him (he’s been watching the girls when the parents are working since they were 2 & 3) and we all enjoy a nice picnic dinner. As the evening rolls on we’re deciding on whether or not to see the laser & fireworks show this evening (there’s a bigger one tomorrow) – when the thunderstorm hits. It’s a real downpour, big lightning and thunder, blowing wind, the works. We wait that out, and then try to take a walk about the park. Eric & I, the two girls and my middle two kids walk over to the clubhouse (to check out the water-taxi ride to the main park area) but the rain has other ideas. We make it to the clubhouse but we’re fairly wet, so we treat the girls to hot chocolate while we dry out. We wait for the rains to end but it’s no good – the rain has turned into a steady drizzle; we’re just as wet by the time we make it back and there’s no end in sight. We give up any idea of tent camping or seeing the laser show and settle for watching a Disney movie (the Sword in the Stone) and having a lazy evening with all 10 of us huddled in the RV. Sleeping arrangements are “cozy” to say the least! But at least everybody is dry.
It’s the 4th of July! We breakfast, cleanup & head over to the water taxi. The rains have stopped and the sun is out. It’s gonna be a hot & humid day. The water taxi is nice, it’s cooler near the lake. We make it to Stone Mountain’s main attraction area and decide to walk to the bell tower. The park is already busier than Eric has ever seen it before. There’s a large Indian family setup under the bells already (and I see more people of the same persuasion walking over to the tower all morning – I think they figured out a cool shady semi-private place to hang out at all day).
We’ve walked maybe a half a mile and it’s not even noon and we’re already soaked with sweat when we make it back to the Plantation Inn. The Inn isn’t open for lunch (although the AC is nice), but the helpful counter lady tells us there’s RV parking closer in. We walk up to Memorial Hall. Immediately two things strike me as really odd: there’s at least 1000 people hanging around looking for food (and more pouring in all the time), it’s 11:30 and *none* of the dozen or so restaurants are open yet – and there’s bus & RV parking open right in front of the main Hall.
I hand the kids my credit card (to get lunch at noon when the restaurants open) and Shelley and I hightail it back to the RV: across 1/2mile of hot trails & roads, ride the water taxi (we miss the one in front of us by literally seconds even with me sprinting across the landing area), and finally the 1/4 mile hike from the taxi dock to the RV. We pull the hookups as fast as we can and roll out & down the road. Nessie does NOT sprint, she *proceeds*, but we made her proceed as fast as possible. We took the short way around the lake, only to discover the road was closed: the attendant at the barricades explains “the road fell in a hole”. Nothing to be had for it; Shelley makes a 3-pt turn on a narrow park road and we go the long way around. Finally, a full hour later, we make it back to the bus/RV parking in front of Memorial Hall – and Lo! it’s open. We take the most premier parking spot in all of Stone Mountain, at noon-thirty on the 4th of July. (A short time later one other RV takes the next spot, then the road is closed behind us).
The amazing thing about the Stone Mountain concessions was the astronomical price for food; hotdogs: $7-$10, drinks also $7 or so. (And they denied a hot and hungry horde for at least an hour???). But finally we all sat down and finished our food and plotted our next move. Shelley, Eric & I all want a big hike. Last Christmas Shelley & I hiked the Grand Canyon down to Phantom Ranch and back out in two days, and Eric has hiked both the Pacific Crest Trail and the Appalachian Trail end-to-end. We head out for the top of Stone Mountain on a hot & muggy day. There’s lots of other folks with the same idea, but it really is a long hot hike. Most of my kids bail out after a mile or so, voting to go hang out in the AC (which is really a good plan); Eric and his two young charges make it to the path-up cutoff but it’s a killer hike in the heat so they turn around also.
It ends up as Shelley, Laura (age 15) and I heading on, and we decide to head for the bird sanctuary. It’s another couple of miles and we gave most of the water to Eric & the girls. The three of us head down the far side of the mountain to a kids playground and finally drag ourselves into the park and help ourselves to the water fountain. We drink a quart each, and fill a couple more quart bottles we’re carrying. We hike the 1/2mile more to the bird sanctuary – mostly carrying on now because of what Shelley would call “Mission” – her ex-Marine training to “complete the Mission” no matter the cost. I.e., we’re all too collectively silly to claim the end goal is ridiculous, so we hike it anyways. It’s a decent enough little woodsy trail, with plenty of songbirds – but far too beastly hot to really enjoy. By the time we make it back to the kids’ park we’ve drunk all our water (another 1/2gal between the 3 of us), so we reload (and re-drink our fill) and head back up the mountain to cross it in reverse. We make it back in good time, although it was really pushing our limits to hike so far on such a hot day.
There is much lounging around and napping in the RV’s AC to wait out the heat of the day. Matthew (age 12) introduces the two little girls to the joys of Minecraft. Eric & Laura nap. Everybody else surfs the (very very slow) park Internet, eating popcorn & chips. Finally as the heat starts to fade and twilight sets in we get enough gumption to make & eat hotdogs. Then we pack it up and prepare to leave the relative safety and peace of the RV for the slowly building horde.
The lawn below Memorial Hall faces the giant sculpture carved into the face of Stone Mountain. The only open spaces are at the very front, so that’s where we head. I estimate 100,000 people eventually filled that lawn; in any case it was a colossal crowd. It was also actually quite a peaceful crowd; no rowdies (no alcohol allowed), zillions of little kids running pell-mell, picnic blankets, soap bubble makers and glowing flashing LED lights. It’s cooler now, so we settle down on our blankets and chairs, listen to the music and wait for the show. At various times I let Josh or Karen & Luke wander off for snacks (a little nerve-wracking that; they are out of sight in the crowd within seconds and gone for 30mins or more, but everybody returns fine).
The fireworks show starts promptly at 9:30 and is possibly the best I’ve ever seen. There’s a laser & light show on the mountain, there’s a Civil War tribute, (there’s ads for all of Georgia’s major sports teams), there’s music and of course fireworks. The actual fireworks were downright amazing; you get a double-echo from the Bang! works, one directly and one bounced off the mountain. They used plenty of the big fireworks and absolutely tons of the rising sparks kind; the entire mountain was a sheet of fire for minutes at a time. The finale left us breathless.
Unwinding back to the camp was a slow but uneventful crawl; I’m sure we beat the campers on foot (who had to wait for the river-taxis and the report was to expect a 2.5 hr wait). Eric took his two charges home and we collapsed tired but triumphant for a full night’s sleep.
It’s another early morning drive, this time we’re heading to San Antonio and then on to Luling TX for more relatives. We’re still marching on through the great desert Southwest, but there are more signs of green now. Some trees mixed in with the sage, and less cactus.
The ride to Luling is long but uneventful. We give Luke another turn at the wheel. The road is calm enough that we let Luke chug on for miles, and then we’re heading into San Antonio. Suddenly the world is full of crazy drivers! People are cutting in front of us, or darting around, or force-merging (on short merges) and giving us no space. Luke brakes as he can, but we’re an 8 ton vehicle! We take at least twice as far to stop as a car! We finally make it to a parking lot.
We have a great dinner in San Antonio with Grandpa & Grandma Weiner, and then we have to brave rush-hour traffic. Shelley takes the helm this time, and a good thing too. I’ve never seen such craziness. We watched a pickup 4-wheeling it over the berm to cut traffic (and yes he set the dry grass on fire, we watched the smoke rise for a long time), we had endless numbers of people fight tooth-and-nail to get in front of us, only to switch lanes back a second later when some other lane had a slight advantage. We have a little sporty thing flash over from left to right, with us doing 60, with less than a foot spare across our bumper! It was all over in an instant, and he missed us, but another foot and we woulda crunched him big. It was a grueling two hours to get out of S.A. Luling, where we spent the night at Grandpa’s house, was great. We yakked all night while the kids worked out their cabin fever. All in all, another fabulous Grandparent visit.
Next morning, crack-o-noon, we headed out for my sister’s place in Katy (really far west Houston). It’s another straight shot down I-10, and I-10 is in pretty good shape even out to Luling; as we approach Houston it widens to 6 lanes. We start watching real weather appear; there’s a line of heavy thunderclouds forming up to the left and right of us and we’re heading right for them. The wind starts to pick up and really buffet us; we slow down to 60 and then slower. People are starting to park on the side of the road, but we want out of the impending storm. Rain alternates between slashing and nothing. The clouds get dark, low and ominous. I start to see green clouds, and clouds moving the wrong direction. I pull out Shelley’s “smart phone” and look up the local weather. Sure enough, with vast modern technology, 4G wifi, low-power android-enabled cloud-backed internet weather smart-phone tech we discover what we already know: there’s two large thundercells on either side of I-10. They happen a lot during south Texas summers as warm wet Gulf air meets cooler midwest air. And these storm cells often spawn tornadoes. But after 20 mins of staring at awe-inspiring clouds and getting slammed by 40mph cross-winds we manage to roll through the middle of them and out the other side. The rest of the trip in is entirely uneventful, except for the trip down memory lane for me.
We get to my sister Ruth’s without incident and my kids rush in to play with her kids. Then we have a comedy of errors trying to get power run to the RV. First our old power cable gets hot and the RV power cuts off (which means the AC cuts off on a hot humid Houston summer day). Then we think the outlet is bad, then we try to test the outlet with an old drill (drill not working), my laptop power supply (cannot see the little blue light in the sun), and finally a real tester (outlet is dead). We switch outlets, then Aunt Ruth tells me the switch for that outlet is flakey, and it surely is; we quick-cycle the RV AC repeatedly without realizing it, and pop a 15amp house breaker. We change outlets again, we change power cords again, we run the new cord through the garage to an internal 20amp circuit, and finally it holds. The RV stays well AC’d for the next 2 days.
Grandma’s over (*my* Mom this time) as she lives a few miles from my sister. And we hang out and visit all day. There’s wine & lasagna for dinner, and hot showers and full beds for all.
We all sleep in late. We have pancakes & bacon for breakfast. We run a few errands and then see the movie Brave (which is really good, BTW). I end up connecting with an old college buddy and her boyfriend (Facebook!) so we invite them over for dinner. Turns out the boyfriend is also an old college friend, so suddenly it was Texas A&M U reunion night. They are both divorced with one teenaged daughter each (compared to my 4), and enjoying life again after divorce. We have a long evening of beer, hotdogs and college memories. The kids Xbox continuously, or get their internet “fix” or play on the trampoline, or have drawing contests or otherwise monkey around. It’s a really great “down time” lazy day.
Next day we take a lazy breakfast and then decide to visit the Biosphere with Grandpa. We head out to Nessie and observe the new tire is looking mighty flat. Humm…. we hook up the air pump… and it’s at 80 psi, spot-on the normal max pressure. Looking further: the inside tire is flat. CRAP. But wait! It really WAS fine in Casa Grande, I checked it before we drove off. And pretty quickly it’s clear that the rubber valve stem is leaking, probably banged too hard during the change and now it’s going flat overnight. We call up Ed. He agrees to change it under warranty… but he doesn’t want to drive out to meet us. But he Does The Right Thing, and calls GCR Tire, a Tucson local who WILL come out to Grandpa’s. Ed’s covering the whole cost. So now we’re basically stuck at Grandpa’s waiting on GCR Tire (who’s promised to get there “in an hour” and it’s already 10am).
Meanwhile something triggered in Shelley’s brain about tires aging, so I go read up on them. I learned something new today: all tires age. After 6 years you should replace them, completely independent of tire wear. Pretty much no tire is expected to last 10 years except under “ideal” circumstances. And tire manufacturers have to stamp the date of manufacture on the tire, so you can tell how old your tires are. (read up on it, but it’s the week & year of manufacture as a 4-digit number in an oval after the “DOT” stamp). So we go look at our tires (mostly meaning Shelley crawling under the RV in the 110-heat to read between the dualies). Sure enough, the youngest tire is 8 years, and the oldest is 12 years old. Good tread, but expected to blow at any moment.
Crap, crap, crap. Another round of planning & family voting. We decide to limp over to Big-O tires to replace the remaining 5 tires, never mind fixing the old one. GCR Tire shows up for the repair while I’m finishing negotiations with Big-O (and yes I asked GCR and no they did not have the tires we need in stock). So the GCR guy politely fills our inside tire (it’ll last maybe an hour) and we roll over to Big-O. We drop everybody off at Costco, where we do some shopping and eat a delicious Costco lunch (which is actually pretty dang cheap and a decent enough hot dog), and wait 2 hours for me to blow another $900 on tires. After a while we’re back at Grandpa’s house with 6 brand-spanking new tires, waiting for a thunderstorm to pass before we go swimming. It’s too late for the Biosphere, that will have to wait for another visit.
The thunderstorm takes too long to pass and we miss swimming also. We have some more family over for a nice dinner, then we hit the road again for more night driving. This time we’re heading for Carlsbad Caverns. It’s a long haul out of Tucson but utterly uneventful. We even give Luke (19 yrs old!) a turn at the wheel. He’s a natural driver and handles this big rig fine. We make a long drive of it but Carlsbad is just too far to make in one day. We end up in the backside parking lot of a Walmart somewhere just inside the Texas border (Walmart mostly has a “RV friendly” policy). It turns out that while our GPS has many useful features, finding RV campsites is not one of them. Also when we turn off I-10 and head into the countryside we lose all cell phone service and can’t call ahead.
It’s a 3hr early-morning drive or so to the Caverns. We get there just before the heat starts getting oppressive again. This time we decide to leave the generator on and the AC running while we spend the hot part of the day underground. I used to see this all the time and wonder about it: RV’s with the generator going constantly. Now I get it – Nessie will be in tolerable shape when we return to her, but without the AC Nessie would heat up like a tin box in the hot sun.
Several of my kids are really nervous about entering the Caverns; they’ve had some scary cave experiences in the past. We have to gently encourage several down the switchbacks into Carlsbad, but they master their fears and soldier on down into the cool cave air. Carlsbad does not fail to deliver. The Caverns are immense on a scale that’s hard to imagine; all of downtown San Jose could comfortably fit in them. The trails wander on for miles in there (the sections closed to the public are probably 100x larger than the miles of public sections). There’s a section where the roof soars over 300ft overhead and single rooms covering many acres with lines-of-sight of perhaps a quarter-mile underground. And it’s all a fairyland of cave growths and little pools, with eerie lighting everywhere; flowing stone sculptures with names like “Temple of the Sun” or “Doll Theater”. For the younger generation: it’s the largest Minecraft cave you’ll ever see.
We ride the 800ft (!) elevator lift back to the surface and decide to stay for the evening bat swarm (it’s still to hot to drive). Every evening at dusk between 250 thousand and a few million bats leave to go eat mega-tons of insects up and down the local rivers (the numbers fluctuate so much because the bats migrate frequently). We hang out in the local gift shop & cafe for a few hours (always a bad plan when on a budget), then try to watch a movie in Nessie (AC keeps it tolerable in there, but it’s still pretty warm), and finally evening rolls around. We settle in to listen to the rangers and then finally the main show: 250 thousand bats fly out of the cave like smoke on the wind. There’s a faint odor of bats in the air, and an endless murmuring of chirping bats and the little winged creatures are flitting everywhere overhead before flying off the escarpment edge and off into the darkness.
We do another (not so long) night of driving, stopping at midnight in Fort Stockton, TX. We get a longer night’s sleep tonight, even if the location isn’t as glorious.
Today’s the day for the start of our epic 3-week 7000-mile cross-country RV trip of doom! I’m up (fairly) early as I need to pick up all my kids – and their extra clothes, toiletries, games, meds, etc – by 9am. Then I take them back to my house to begin packing in earnest, except for Josh who I need to take to the eye doctor’s to replace his glasses (broke under warranty) and Laura – who left her drawing pad behind. I also need to drop my ex-Sprint AirWave back at the UPS store, and go by the pharmacy for a month’s worth of meds, and get fresh fruit for the RV, and… and … and … you get the picture.
Meanwhile Shelley is busy doing last-minute packing of Nessie, our 7-ton 31′ Class C RV – all the fruits & veggies & cold-stuff go in at the last minute. While I’m running around frantically driving kids all over creation, Matt figures out he’s got a total of 3 pairs of underwear at my place, so Shelley is out driving him to get some undies (and other stuff we need) while I’m running my errands. Despite the crazy start and hasty lunches we actually hit the road as planned right at noon.
So on this trip we have: Me & Shelley (a red-head), my eldest daughter Karen, Luke (another red-head), my son Josh, my 2nd daughter Laura (also a red-head but no relation to Shelley) and my youngest Matt. We’re off to see the country and all my scattered relations. I’ve got my Dad (& Jane) living in Tucson AZ, the kids’ other grandpa Zade in Luling TX (outside of San Antonio TX), my sister (Aunt Ruth) and mom (Pat Ireland) in Katy TX (outside of Houston), my brother in Atlanta GA, and my Uncle Bill and his 4 daughters (all my age) and their 15 kids (all my kids’ ages) in eastern Connecticut.
We’re starting out of San Jose, heading over Pacheco Pass to I-5, then south towards LA – but we badly do NOT want to hit LA right at rush hour, so we eventually cut over to Bakersfield and then follow some long long slow farm road across the central valley to Barstow… and up to Calico, a ghost town.
Now when Shelley was a kid, her mom would drive this very road (to visit her grandparents in Vegas) and they would stop by Calico once a year or so. She has some fond memories from her childhood so visiting Calico is somewhat of a pilgrimage for her. We arrive there right at dusk and can’t find anybody manning the entrance booth, so we sheepishly drive (our 31′ RV) quietly into the town – and promptly find the RV campground. It’s basically deserted (there’s 1 other camper there, and space for maybe 100 vehicles), has power hookups and bathrooms with showers and running water… and it’s free, at least for people arriving as late as we did. We got out, stretched our legs and enjoyed the beautiful pink sunset over the red red hills, made sloppy joes on Nessie’s stove and ate on the picnic tables in picture-perfect weather. Laura got the neighborhood dogs to howl back at her, Karen & Luke made videos of Laura’s epically blowing hair, Matt climbed the hills and Josh & I ninja-sparred.
It was a picture-perfect ending to the 1st day.
We walk through Calico the next morning. It’s cool desert morning air, with some wonderful history. The town’s been cleaned up a fair amount since Shelley was last there but remains a really nice tourist trap. Mission accomplished, we head out for the long hot desert drive to Tucson to visit my Dad (Grandpa). It’s a *long* boring drive down I-40. Shelley is an awesome long-haul truck-and-horse-trailer driver, so driving this RV thing is a piece of cake. (and while I’m up getting Shelley a nectarine, Laura types in my blog: “Moo” and “He has yet to notice.”) Karen is talking about whale sperm shampoo (*not* sperm whale shampoo)… and the generator cuts out – it’s overheated. That means the main compartment AC cuts out. Oh – did I mention that on the long uphill grades the cab AC also cuts out? (I assume because the engine is working too hard?). So we pressed on in the 110-degree heat, across I-40, down “highway” 95 (looks like asphalt thinly spread over desert dirt, there’s a whole lotta “dips in road”). Back on I-8 and heading west, and 2hrs out of Tucson and we’re all baking on-and-off (as the cab AC cuts in and out, and the cabin is slowly climbing above 90 degrees)… when we blow a tire.
Yup, 20 miles from Nowhere, AZ, down that long & lonely road… we suddenly pick up a shake & shimmy… and a list. We hove Nessie over to port and off the side of the road. I tremulously step out to survey the damage. Outside right rear tire has blown big, completely come apart. It’s 1 of a dually, and the other is squashed under the load but holding. Time for some quick thinking; we are baking and a long way from anywhere… and lame. We check the phones: we have cell service; Thank You T-Mobile. We call AAA. They don’t do RV tires but they do give us the number of RV Medic in Casa Grande… which isn’t open after hours. We get the answering machine & another number to call… also no answer. So now we’re calling all about (at least 3 phones making active calls at this time, plus Google Maps is in hot action). We decide to limp into Casa Grande. We dump the tanks (not the black!) and push the kids over to the “good” side to lighten the load. We also batten down the hatches, as Shelley points out that if the remaining tire blows we’ll “drop hard”. Casa Grande is about 20 miles down the road, and we decide that 40 mph is probably a good max-speed so we start off.
Then the dust-storm hits. NO I AM NOT KIDDING. We’re lamely limping along when the wall of dust hits, obliterating the “Blinding Dust Storms” road sign. So now we’re limping blindly along getting buffeted by 40mph winds and dust (and tumbleweeds ARE blowing by, cue lonely wild-west music please) when the rain hits. Yes: thick dust on our windshield AND IT’S RAINING NOW WITH THE BLOWING DUST. Nessie soldiers on. 20min later we pull off I-8 and out of the storm and head down some lonely farm road… but with the lights of Casa Grande clearly in the distance. We pull into the first big lot we see (Big Tires empty lot), step out and see a rainbow. Back around to calling RV Medic we get a human, who tells me to call Ed W’s who DOES do after-hours work. $200 minimum charge. Ed (who also requires 3 or 4 phone calls to reach) promises he can work on us, but can we get to town? No problem.
While we wait at Ed’s shop for an hour (his mobile guy is on another call), Grandpa & Grandma drive up from Tucson and take the 3 younger kids back to their place and feed them all manner of treats. We (4 remaining) older kids mosey over to a nearby restaurant and get dinner and some heat relief. Another hour later and I’m $400 poorer and sporting a brand-new tire. We pile in and make it to Grandpa’s. Many sighs of relief, and a good night’s sleep was had by all.
It’s been a freak’n month since I last blogged! Where’s the time gone???
Mostly I’ve been furiously coding. ‘wc *java’ of our ‘src’ directory now reports 31500 lines. We’ve cleaned up and CSS’d the web interface. We added LevelDB to handle zillions of small K/V pairs (larger ones go to the local file system directly, and of course we still handle S3 and HDFS natively (either using an existing hadoop install, or directly *being* a distributed hadoop)). We’re still 100% peer-to-peer, even for the direct HDFS stuff. Last week I hacked a concurrent Patricia Trie (leaving the making of a *distributed* concurrent Trie for later, but now I know how to do it…). Then we ran all 36Gig of Wikipedia data through WordCount, using that Trie – it took less than an hour on 1 node.
This week it’s about running a Linear Regression *distributed*, using distributed Fork/Join as the programming paradigm. Also, integrating a HashMap-in-a-Value (so we can pass about & maintain the Map interface in the Value piece of our K/V store – think: distributed JS objects), plus the final bits of VectorClocks (all behind the scenes; the VCs will let us do atomic update and strong coherence of Keys but they’re a horrible API to expose). We’re building a toolkit approach to solving the problem of building a reasonable database over the Cloud. Either (distributed) Patricia Tries or (distributed) Concurrent Skip Lists for range queries, plus JS-like objects in Values, plus atomic (transactional) update of individual JS objects using a Compare-And-Swap like approach (instead of locking: CAS is much faster under load, as threads can optimistically make progress).
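To make the Compare-And-Swap idea a bit more concrete, here is a minimal single-JVM sketch of an optimistic update on a JS-like object. It is only an illustration under my own assumptions: the class name CasObject and its methods are made up for this example and are not H2O’s Value API, and a real distributed version would have to run the same retry loop across nodes rather than on a local AtomicReference.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; not H2O's Value class.
class CasObject {
    // The current immutable snapshot of the "JS-like object".
    private final AtomicReference<Map<String, Object>> snapshot =
        new AtomicReference<>(Collections.<String, Object>emptyMap());

    Map<String, Object> get() { return snapshot.get(); }

    // Optimistic update: copy the snapshot, modify the copy, then CAS it in.
    // No lock is held; a thread that loses the race simply retries.
    Map<String, Object> update(UnaryOperator<Map<String, Object>> change) {
        while (true) {
            Map<String, Object> old = snapshot.get();
            Map<String, Object> next =
                Collections.unmodifiableMap(change.apply(new HashMap<>(old)));
            if (snapshot.compareAndSet(old, next)) return next;
            // CAS failed: another thread published first, so re-read and retry.
        }
    }

    // Atomically add 1 to an integer field, treating a missing field as 0.
    int increment(String field) {
        Map<String, Object> m = update(copy -> {
            copy.merge(field, 1, (a, b) -> (Integer) a + (Integer) b);
            return copy;
        });
        return (Integer) m.get(field);
    }

    public static void main(String[] args) {
        CasObject obj = new CasObject();
        obj.increment("hits");
        obj.increment("hits");
        System.out.println(obj.get());
    }
}
```

The point of the design is that a failed CAS costs one retry of a cheap copy, while a contended lock parks threads; under load at least one thread always makes progress.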
More on all of the above later this week – as we have a hard deadline to finally *open* our Open Source project. Yeah, yeah, yeah, I’ve been hassled plenty about calling ourselves Open Source and not (yet) having any open source… we’ve been trying to get the basics done first… but the real news: I’m finally going on Vacation!!!
Yes, Nessie, the 31′ 7-ton Class C RV of Doom is being prepared for our 7000 mile Epic Cross-Country Journey. I’ve been wanting to do this for a decade now: take the entire clan (7 of us!) across country, touring all the junk tourist traps we can and visiting our scattered family as we go. We got family in Tucson AZ, San Antonio TX (well, Luling really), Houston, Atlanta, DC area, and Connecticut. I’m giving an invited lecture at UIUC on our way back, and have been assured I can use that lecture as a reason to declare this a “business trip”, and deduct all the gas and mileage costs – I figure about $3500 in gas alone. We’re stopping at Stone Mountain in GA over the 4th of July, visiting my brother and camping at the lakeside facing the mountain where we’ll watch the fireworks and laser show from the RV roof. We’re going to visit Carlsbad Caverns. We’ll pass through DC and maybe attempt the Smithsonian (not sure about that one; depends on the schedule and how badly I want to fight the RV through DC traffic). We’re visiting my Uncle’s classic family farm in Connecticut where my 4 cousins live – all my age, all married with 3 to 4 kids each… all about the same age as my 4 kids. We’re talking now about 15 to 20 nieces and nephews, plus Aunts & Uncles galore, and of course pigs and chickens and horses. It’ll be a regular zoo.
So if you see a large white whale heading east on I-10 with a frazzled Shelley or my excited 19yr-old at the helm, honk, wave Hi and give us a wide berth…
Quote(s) of the Month from Kevin Normoyle (Sun/Sparc & Azul L2 Cache Designer Extraordinaire, Cache Coherence Advisor to 0xdata):
Reminds me of CS101, on one of my first programs. The grader wrote in big red letters over my big comment block:
“Don’t document your bugs, Fix them”
So I asked Kevin if I could quote him, and I got this response back:
ah that’s fine…I spout “Advice” left and right to everyone… Many dismiss it as “Rant”. There’s always that fine line between being a Prophet, and just another crazy guy standing on the corner yelling. One could argue that everyone who ever posts to Twitter is an “Advisor” of some sort, to the world.
Sound advice, from a (reluctant) adviser to the world.
The Diablo3 bomb blew through my house this week, destroying work schedules left and right. Every kid (& Dad) played hours of D3. OMG’s, I can remember D1 – way back in ’96 before the Diablos were numbered. I must be older than dirt. Also, being CTO of 0xdata means a zillion customer visits last week (thanks to our plugged-in CEO Sri). Git claims 600 lines of code from me, down from my weekly average of 3000… blah. Coding is good for me, I need to do more!
Meanwhile, work at 0xdata is actually proceeding really well despite my lackluster week. We’re reading & writing HDFS natively. As I write this, we’re now able to read & write S3. We’ve got the semantics and design of what is basically the Java Memory Model ironed out for the Cloud (although the implementation is still being worked on). We’re starting to launch Paxos-based H2O clouds in Amazon EC2. We’re running larger test suites.
What little coding I did was relating to making Key-delete work right. The issue is racing Puts followed by Deletes, and delivering a strongly consistent answer when UDP packets are getting lost or re-ordered. A late-arriving Put cannot “resurrect” a deleted Key and that requires keeping some VectorClock smarts on the deleted Key, instead of just removing all knowledge of the Key.
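Here is a rough sketch of the tombstone idea, under some loud assumptions: a single increasing long version stands in for the VectorClock (a real vector clock is only a partial order and needs a tie-break for concurrent updates), and the TombstoneStore class and its methods are hypothetical names for this example, not H2O code. A delete leaves behind a versioned marker instead of erasing the Key, so a Put that arrives late with an older version loses the comparison and cannot bring the Key back.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; a long version stands in for a VectorClock.
class TombstoneStore {
    // An entry is either a live value or a tombstone (value == null).
    // Both kinds carry the version of the operation that produced them.
    static final class Entry {
        final long version;
        final String value;
        Entry(long version, String value) { this.version = version; this.value = value; }
        boolean isDeleted() { return value == null; }
    }

    private final ConcurrentHashMap<String, Entry> map = new ConcurrentHashMap<>();

    // Keep whichever entry has the newer version; returns true if 'incoming' won.
    private boolean apply(String key, Entry incoming) {
        Entry kept = map.merge(key, incoming,
            (current, in) -> in.version > current.version ? in : current);
        return kept == incoming;
    }

    boolean put(String key, String value, long version) {
        Objects.requireNonNull(value, "use delete() to remove a key");
        return apply(key, new Entry(version, value));
    }

    // A delete stores a tombstone rather than removing all knowledge of the key.
    boolean delete(String key, long version) {
        return apply(key, new Entry(version, null));
    }

    // Returns null both for never-seen keys and for deleted keys.
    String get(String key) {
        Entry e = map.get(key);
        return e == null ? null : e.value;
    }

    public static void main(String[] args) {
        TombstoneStore s = new TombstoneStore();
        s.put("k", "v1", 1);
        s.delete("k", 3);                          // delivered before the older put below
        System.out.println(s.put("k", "v2", 2));   // late, re-ordered put is rejected
        System.out.println(s.get("k"));            // key stays deleted
    }
}
```

The obvious cost is that tombstones pile up; when it is safe to garbage-collect them is a separate problem this sketch does not attempt.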
We’ve got the Git repo opened up to a handful of people and we’re debating when to open it fully. I’m voting for “wait a little longer”; in particular I want to iron out the design of the execution engine more. I.e., “word count” on HDFS should not just run fast & well, it should look good also. I might get overruled on the timing of this, but in any case look for our Git to open up “soon” – some weeks or less.
In other news, I got my $500 deductible returned to me from AllState (which they got from the other drivers’ insurance). We sold my fiancée’s junker car and upgraded her to a car with only 70K miles (down from 225K miles! The unkillable Nissan Maxima’s brakes finally failed). I switched the family over from Sprint to T-Mobile – it’s a better family plan (for me anyways), and that means I finally upgraded my antique phone… to another antique! Yes! I managed to dodge the smart-phone brain-drain that’s got all my colleagues one more time.