Tuesday, January 26, 2010

Operation Aurora - 'Obscure' CRC Code May Not Be That Obscure

Via The Register UK -

An error-checking algorithm found in software used to attack Google and other large companies has circulated for years in English-language books and on websites, casting doubt on claims that it provided strong evidence that the malware was written by someone inside the People's Republic of China.

The smoking gun said to tie Chinese-speaking programmers to the Hydraq trojan that penetrated Google's defenses was a cyclic redundancy check routine that used a table of only 16 constants. Security researcher Joe Stewart said the algorithm "seems to be virtually unknown outside of China," a finding he used to conclude that the code behind the attacks dubbed Aurora "originated with someone who is comfortable reading simplified Chinese."

"In my opinion, the use of this unique CRC implementation in Hydraq is evidence that someone from within the PRC authored the Aurora codebase," Stewart wrote here.

In fact, the implementation is common among English-speaking programmers of microcontrollers and other devices where memory is limited. In 2007, hardware designer Michael Karas discussed an almost identical algorithm here. Undated source code published here also bears more than a striking resemblance.

The method was also discussed in W. David Schwaderer's 1988 book C Programmer's Guide to NetBIOS. On page 200, it refers to a CRC approach that "only requires 16 unsigned integers that occupy a mere 32 bytes in a typical machine." On page 205, the author goes on to provide source code that's very similar to the Aurora algorithm.
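The storage figure quoted from the book is easy to check. The sketch below is only an illustration, not code from Schwaderer's book or from Hydraq. It assumes the CCITT polynomial 0x1021 (the article does not say which polynomial was used), and the helper name build_table is hypothetical. It builds one table indexed by a 4-bit nibble and one indexed by a full byte, then compares how much room each needs when stored as 16-bit unsigned integers.

```python
import struct

POLY = 0x1021  # assumption: CCITT polynomial; the article does not name one


def build_table(index_bits):
    """Build a CRC-16 lookup table indexed by `index_bits` bits of input."""
    table = []
    for index in range(1 << index_bits):
        crc = index << (16 - index_bits)  # place the index in the top bits
        for _ in range(index_bits):
            if crc & 0x8000:
                crc = ((crc << 1) ^ POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
        table.append(crc)
    return table


ENTRY_SIZE = struct.calcsize("<H")  # 2 bytes per 16-bit unsigned integer

nibble_table = build_table(4)  # 16 entries, one per 4-bit value
byte_table = build_table(8)    # 256 entries, one per 8-bit value

nibble_bytes = len(nibble_table) * ENTRY_SIZE
byte_bytes = len(byte_table) * ENTRY_SIZE
print(nibble_bytes, byte_bytes)
```

Under these assumptions the nibble table takes 32 bytes against 512 for the byte table, which is why the smaller table is attractive on devices with very little memory.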

"Digging this a little deeper though, the algorithm is a variation of calculating CRC using a nibble (4 bits) instead of a byte," programmer and Reg reader Steve L. wrote in an email. "This is widely used in single-chip computers in the embedded world, as it seems. I'd hardly call this a new algorithm, or [an] obscure one, either."
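The nibble-at-a-time approach the reader describes can be sketched as follows. This is a minimal illustration, not the Hydraq routine or any of the published listings mentioned above: the polynomial (0x1021), the zero initial value, and the function names crc16_nibble and crc16_bitwise are all assumptions made for the example. Each input byte is split into its high and low 4 bits, and each half costs one lookup in a 16-entry table. A plain bit-by-bit version is included so the two can be compared.

```python
# Assumption: CCITT polynomial 0x1021 with a zero initial value.
POLY = 0x1021

# 16-entry table: the CRC of each possible 4-bit value.
NIBBLE_TABLE = (
    0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50A5, 0x60C6, 0x70E7,
    0x8108, 0x9129, 0xA14A, 0xB16B, 0xC18C, 0xD1AD, 0xE1CE, 0xF1EF,
)


def crc16_nibble(data, crc=0x0000):
    """CRC-16 computed 4 bits at a time using the 16-entry table."""
    for byte in data:
        # High nibble first, then low nibble.
        for nibble in (byte >> 4, byte & 0x0F):
            index = (crc >> 12) ^ nibble
            crc = ((crc << 4) & 0xFFFF) ^ NIBBLE_TABLE[index]
    return crc


def crc16_bitwise(data, crc=0x0000):
    """Reference version with no table: one shift per input bit."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


message = b"123456789"
print(hex(crc16_nibble(message)))
print(crc16_nibble(message) == crc16_bitwise(message))
```

The table version does two lookups per byte where the bitwise version does eight shift-and-test steps, a reasonable trade on a small microcontroller that cannot spare room for a 256-entry table.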


The claim that the CRC was lifted from a paper published exclusively in simplified Chinese seemed like the hard evidence that was missing from the open-and-shut case. In an email to The Register, Stewart acknowledged the CRC algorithm on 8052.com was the same one he found in Hydraq, but downplayed the significance.

"The guy on that site says he has used the algorithm, didn't say he wrote it," Stewart explained. "I've seen dates on some of the Chinese postings of the code dating back to 2002."

Maybe. But if the 16-constant CRC routine is this widely known, it seems plausible that attackers from any number of countries could have appropriated it. And that means Google and others claiming a China connection have yet to make their case.

The lack of evidence is important. Google's accusations have already had a dramatic effect on US-China relations. If proof beyond a reasonable doubt is good enough in courts of law, shouldn't it be good enough for relations between two of the world's most powerful countries?


  1. Excellent follow up. I haven't seen this elsewhere on the roughly 200 blogs that I follow. Very good info.

    It appears confirmed (IMO) that the payload Symantec is reporting on: http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/inside_trojan_clampi.pdf is dropped as part of the Aurora intrusion set (the details, hosts and exploits, line up). But the malware is well known and established. It's also believed to originate in Europe or Russia.

  2. When this story first broke, I don't think anyone expected to find a rock-solid smoking gun. The people who pulled off this attack, and perhaps many others, are skilled and smart... but like any good attackers, they are lazy.

    They follow the path with the least resistance while still achieving the goal.

    If they don't have to write custom code to get the job done, they aren't going to waste their time. If they can hack you and steal your IP with a 4-year-old PDF exploit, that's what they are going to use.

    So looking at the malware might be helpful as a piece of the grand puzzle, but as an origin smoking gun, it is less than optimal IMO.