602 HIKES and counting


Lava Kafle


FAWN: Fast Array of Wimpy Nodes for Sun Oracle, Google, and Facebook

by mbenedict October 16, 2009 7:12 PM PDT
It’s not just about the raw cost. There’s a finite amount of electricity you can bring to a data center, so at some point the number of queries you can do per kWh becomes very important. The article mentions heat as waste but like electricity, heat itself also becomes a limiting factor in a large data center. There’s only so much cooling capacity available beyond which you get severe diminishing returns.

So a system which promises to be more energy efficient and runs cooler at the same time… that could be a big win.
by symbolset October 17, 2009 8:19 PM PDT
I’ve been a proponent of FAWN for a long time. For ten years the software has provided the redundancy and the scale. FAWN is not the right answer for every problem, but no tool is.

Configuring the right solution for massively parallel problems is a fairly complex geometry. If you approach a large-grain problem from a cents-per-compute-per-second perspective then FAWN is a slam dunk. For fine-grain problems you want to use GPGPU instead. When the problem becomes large enough, custom system boards and esoteric processors enter the solution set.

It’s really only when you don’t know the granularity of the problem, or you need a general solution that solves both ends of the granularity scale and the middle too that Industry Standard architectures are ideal. In these cases a mixed cluster of wimpy nodes combined with GPGPU nodes may be more cost effective.

Oh, and about cooling: The answer to many problems that start “How do you…” is… don’t. As many have shown, the correct answer to the cooling problem is not refrigeration, it’s location, location, location. Your servers are rated to 35C (95F) at least, and if the ambient temperature where they are rises above that, you located your servers in the wrong geographic area, which is a different problem. There are lots of places you could put your servers that won’t get that hot in the next decade. Put your servers some place where the ambient temperature never goes out of range, preferably where they have cheap power (I hear Canada is nice).

To find the ideal operating point for the fans of your datacenter, heat the inlet air to 35C. Fire up the equipment and stress test it at maximum capacity. Measure the outlet temperature. Now you have the ideal outlet temperature. Regulate the fan on the exhaust such that the exhaust is consistently that temperature, less a few degrees for safety, and your server components will remain at a consistent temperature (thus preventing swings in temperature which can cause problems). This is not as complicated as you might think.

As an added benefit, during a “heat wave” stationary inversion the thermodynamics of a hot exhaust plume exiting high above the building plus the related ground-level cool air inlets creates a cooling breeze which diminishes the air conditioning required to cool the humans in the related office spaces when they’re not in the datacenter. Don’t insulate the datacenter part either – that’s swimming upstream. Maintaining a snow load on the roof should not be a design goal. Also, in really intemperate climes filter the exhaust and pass it through the human workspaces (or if you’re really fussy, use a heat exchanger) – the servers are heating air, there’s no sense burning extra energy to heat separate air to keep the humans comfy.
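The fan procedure in the comment above (calibrate the outlet temperature at a 35C inlet under full load, then hold the exhaust a few degrees below it) can be sketched as a simple incremental controller. This is only an illustration under stated assumptions: the class name, the 3-degree margin, the gain, and the speed limits are hypothetical values chosen for the example, not figures from the commenter.

```python
from dataclasses import dataclass


@dataclass
class ExhaustFanController:
    """Holds exhaust air near a calibrated outlet temperature (hypothetical sketch)."""

    # Outlet temperature measured with a 35C inlet at maximum load.
    calibrated_outlet_c: float
    # "A few degrees for safety"; 3.0 is an assumed value.
    safety_margin_c: float = 3.0
    # Fan-speed change (fraction of full speed) per degree of error; assumed.
    gain: float = 0.1
    # Assumed limits on fan speed, as a fraction of full speed.
    min_speed: float = 0.1
    max_speed: float = 1.0

    @property
    def setpoint_c(self) -> float:
        # Target exhaust temperature: calibrated value less the safety margin.
        return self.calibrated_outlet_c - self.safety_margin_c

    def next_speed(self, current_speed: float, measured_outlet_c: float) -> float:
        # Exhaust hotter than the setpoint: speed the fan up.
        # Exhaust cooler than the setpoint: slow it down so the exhaust warms.
        error = measured_outlet_c - self.setpoint_c
        speed = current_speed + self.gain * error
        return min(self.max_speed, max(self.min_speed, speed))


# Example with a made-up calibration result of 50C.
controller = ExhaustFanController(calibrated_outlet_c=50.0)
faster = controller.next_speed(current_speed=0.5, measured_outlet_c=49.0)
slower = controller.next_speed(current_speed=0.5, measured_outlet_c=45.0)
```

Slowing the fan when the exhaust runs cool is what keeps component temperatures steady, which is the point the commenter makes about avoiding temperature swings. A real installation would need sensor filtering and tuning that this sketch leaves out.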

by ckurowic October 18, 2009 11:01 AM PDT
I disagree with your point about recirculating the hot air from the servers to people’s work areas. Some are VERY sensitive to the outgassing that occurs when equipment is new (and even for many months afterward). You have interesting concepts, but I’m afraid you don’t have the engineering background to support them.

by Christopher_Mims October 20, 2009 11:47 AM PDT
Great article – provides a lot of detail that didn’t make it into my own write-up of FAWN for Technology Review. If you’re interested in a slightly different take, though:

“We were looking at efficiency at sub-maximum load. We realized the same techniques could serve high loads more efficiently as well,” said David Andersen, the Carnegie Mellon assistant professor of computer science who helped lead the project.
It’s not just academic work. Google, Intel, and NetApp are helping to fund the project, and the researchers are talking to Facebook, too. “We want to understand their challenges,” Andersen said.
Cut the power
These large-scale systems don’t come cheap. Besides the hardware, software, and maintenance costs, there’s power, too, and companies often must pay for energy twice, in effect, because servers’ waste heat means data centers must be cooled down.
by catbutt5 October 16, 2009 11:51 AM PDT
Oh, I needed a good laugh…

“And addressing the brains… Anil Rao is one inventor on a … patent applied for a computer system with numerous independent processor modules that share access to shared resources including storage, networking, and boot-up technology called the BIOS.”

Trying to patent something that’s existed for more than 20 years, are you? Good luck with that.

Anil, ever heard of Sun or IBM or companies that sell refrigerator sized (small and large) computers full of little card slots containing memory and processors (even at different frequencies) that share storage, networking and yes, even the BIOS? It’s the same concept.

What’s your act 2? Gonna try to patent the automobile?

by kirkktx October 16, 2009 12:54 PM PDT
“52 queries per joule of energy compared to 346 for a FAWN cluster”

Somewhere I saw that electricity costs exceed hardware costs amortized over the life of the computer. These numbers should certainly attract investors.
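To put the quoted figures in the per-kWh terms used earlier in the thread, here is a rough conversion sketch. The 52 and 346 queries-per-joule values come from the quote above; the function names, the electricity price, and the 1 MW power cap are assumptions made for illustration, and cooling overhead is ignored.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 million joules

CONVENTIONAL_QPJ = 52  # queries per joule, from the quoted article
FAWN_QPJ = 346         # queries per joule, from the quoted article


def queries_per_kwh(queries_per_joule):
    """Convert an efficiency in queries per joule to queries per kWh."""
    return queries_per_joule * JOULES_PER_KWH


def max_query_rate(power_watts, queries_per_joule):
    """Queries per second sustainable under a fixed power budget.

    One watt is one joule per second, so rate = watts * queries per joule.
    """
    return power_watts * queries_per_joule


def energy_cost(num_queries, queries_per_joule, price_per_kwh):
    """Electricity cost of serving num_queries at the given efficiency."""
    kwh_used = num_queries / queries_per_kwh(queries_per_joule)
    return kwh_used * price_per_kwh


# How many times more queries FAWN serves per unit of energy.
efficiency_ratio = FAWN_QPJ / CONVENTIONAL_QPJ

# Assumed values for illustration only.
ASSUMED_PRICE_PER_KWH = 0.10       # dollars
ASSUMED_POWER_CAP_WATTS = 1_000_000  # a 1 MW feed

conventional_cost = energy_cost(1e9, CONVENTIONAL_QPJ, ASSUMED_PRICE_PER_KWH)
fawn_cost = energy_cost(1e9, FAWN_QPJ, ASSUMED_PRICE_PER_KWH)
```

On the quoted numbers the FAWN cluster serves roughly 6.7 times as many queries per joule, so under the same power cap it can sustain roughly 6.7 times the query rate, and the energy bill per query shrinks by the same factor.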

