Monday, December 17, 2007

This blog is first in Google search!

I just noticed that Google is giving high search rankings to posts in this blog.

I am particularly proud of one search query for which this blog shows up first:

w-h-y
a-m-d
d-o-i-n-g
s-o
b-a-d

I put in the hyphens to obfuscate the words; I don't want Google to point to this article when you enter that search query.

The link points to "Does AMD know what it's doing?" written in June, a good link indeed.

The celebration of the second anniversary is coming, and I am happy that things are "clicking" for the blog: the market has vindicated the author, I am finding good material to work with, the quantity has increased, and the articles have gained much more acceptance. The hike in Google ranking is also a reason for you, the reader, to celebrate: it means that a truly independent medium for expressing opinions, not conditioned, mediated, or controlled by Wall Street, remains open for your passive participation (or, if you are commenting, active!), while rising in significance.

I hope to be up to the heightened responsibility.

Note: I found out about this particular search query because I use a service that, when a visitor arrives from a search, keeps tabs on the queries.

Thursday, December 13, 2007

Analysts Day

Today AMD will be holding its annual Financial Analysts Meeting. To put the meeting into perspective, I want to review what AMD's management has been forecasting against what has actually been happening. I can understand AMD's optimism a year ago; after all, the period of great successes had barely finished. You can check the review of 2006 in Bob Rivet's presentation to get the feeling. The problem is that they applied the model of great successes to a period where it didn't apply. As I explained in "important about K10", AMD is in "full lying" mode; it cannot tell the truth because the truth is just too horrible. It is important, then, to be able to detect the lies tomorrow. To help in that regard, below I summarize other statements from the company.

Fortunately, Roborat64 already summarized the last meeting for us; he wrote an article about this subject [ formatting changed ]:

[H]ere is what AMD projected for 2007:
  • K10 quad-core ramp: 2H’07; actual result: pushed out possible mid Q1'08
  • Barcelona performance: 40% better; actual result: ~40% worse (non-compliant SPEC benchmarks)
  • CAPEX: $2.5B; actual result: 2007 estimate will be at $1.7B (Fab38 delayed)
  • Revenue (long term target): ~$7.6B; actual result: $6.02B (average analyst estimates)
  • Gross Margins: 50+/-2%; actual result: 35% (last 3 qtrs)
  • 2007 growth: 10% above industry (16%); actual result: -455% [ sic ]
I wish to mention a few more things:

In Mr. Seyer's presentation, slide 16/52, the quadcore was projected to have 40% superior performance, accelerated virtualization, and 60% improved power efficiency. Well, the kinds of workloads that the TLB bug affects most are precisely those related to virtualization, where it can be as much as 50% slower. Regarding the power efficiency, Jason Mick @ Daily Tech wrote a blog post demonstrating the lies about K10 power consumption [ thanks to the Intel vs AMD blog for the link ]:
To put [the datum that K10 consumes 137 Watts] in perspective, a 3.16 GHz Xeon X5460 from Intel squeaks in at a still weighty 120 W. While AMD failed to disclose in the white paper on what frequencies its selected processors operate, it is almost surely 3.0 GHz or lower, as 3.0 GHz is the highest speed K10 processor currently demonstrated. The best case scenario is that a 166 MHz slower AMD processor consumes 17 more watts [my emphasis]
[...]
However, if the samples tested were lower than 3.0 GHz, obviously the picture becomes far worse. And since AMD's 2008 roadmap states that its 2.4 GHz processors are rated at 125 Watts TDP, this is almost certainly the case. Architecture and design advantages aside, K10 is a chip that is almost a gigahertz slower but with a significantly higher power consumption rating.
So much for the often repeated superior power efficiency of the "native" quadcore...
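The comparison in the quote reduces to simple arithmetic. Here is a minimal sketch of it; the 3.166 GHz clock is the Xeon X5460's nominal specification, while the 3.0 GHz clock for the K10 part is Daily Tech's assumption, not a disclosed figure:

# The quoted comparison as straight arithmetic. The AMD clock is an
# assumption (Daily Tech's), not something AMD disclosed.
intel_ghz, intel_w = 3.166, 120   # Xeon X5460
amd_ghz, amd_w = 3.0, 137         # K10, assumed best case

print(f"clock deficit: {(intel_ghz - amd_ghz) * 1000:.0f} MHz")   # ~166 MHz
print(f"extra power:   {amd_w - intel_w} W")                      # 17 W
print(f"GHz per watt:  Intel {intel_ghz / intel_w:.4f} vs AMD {amd_ghz / amd_w:.4f}")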

We cannot forget that last year, on December 14, the company was speaking of fabulous forecasts, and less than a month afterward it had to warn of a miss because the fourth quarter was much worse than anticipated.

In the Q4 2006 CC (transcript), Mr. Meyer's presentation said things like "We need to improve our financial performance relative to what we delivered in Q4. We will do so by delivering improved products, lowering our manufacturing costs, increasing our operating efficiencies across all disciplines, and continuing to grow share". Interestingly enough, the employee count hasn't gone down, which means that perhaps there weren't synergies between AMD and ATI, at least from the Human Resources perspective. So much for all Mr. Meyer said. Dr. Ruiz: "I am incredibly optimistic and excited by the future of this company, more than I have been in the seven years that I have been with this company". Very well, let me know the next time you feel optimistic; I will gamble against your optimism. Mr. Rivet reiterated the Analysts Day guidance when it was already very clear that the guidance was pure fiction. Also, Mr. Meyer felt "very bullish" that as soon as the "native quadcore" was introduced, it would recapture the performance lead...
I wrote "Catastrophe" regarding the tragic comedy of Q1 2007 CC (transcript), so, I will not reiterate it here.

Then there is Q2 (transcript). Meyer: "Our Fab 36 conversion to 65-nanometers is complete, with yields exceeding expectations and we now turn all our attention to 45-nanometer" [!!], "We are on a path to bring our gross margins and operating expenses back into a reasonable balance and improve our cash flow". Then, in the Q&A, Meyer: "First of all, we’re very happy with our 65-nanometer yields across all products, including Barcelona, so no issue there. The fact of the matter, Barcelona, while being an absolutely great product, is complicated and it’s taking a little bit more design work than we anticipated getting the final rim in place". Dr. Ruiz: "as Dirk mentioned, on 65-nanometer have been phenomenal, been outstanding". Rivet: "We would really like Q4 to be break-even, or to be specific, not just the month of December but definitely Q4". Then there is one of the all-time greatest pieces of bullshit, by Henri Richard, about which I hope someday to write an entire article:
I don’t know of any IT manager that ever asked what was the nanometer in this processor and I don’t know of any student walking into a store and really wondering what the die size of a processor, let alone in some cases what’s the frequency? What they do is they look at more and more what is this machine going to do for me? How does this look? Is it a fashion statement? Is it responding to my needs?
In Q3 (transcript), Chris Danely of JP Morgan asks: "When do you guys expect to start shipping either at 2.4GHz or 2.5GHz Barcelona?" Meyer: "The plans that we have haven't changed from what we talked about around the timeframe of the Barcelona launch, which is to ship the 2.5GHz product in the middle of this quarter". Answering another question: "Based on the input we're getting from our customers and end users, there is a lot of demand for Barcelona, I tell you. We're just seeing people licking their chops and ready to get their hands on the product".

I am saving the bullshit of "asset light" for the end: Asset Light is empty talk because, very probably, AMD is forced by contractual obligations to manufacture in Dresden; otherwise the government wouldn't have pitched in. The x86 license caps the number of processors that can be outsourced. If AMD cuts its production scale, it will suffer worsened economies of scale. AMD's nonviability stems from being forced to sell a given quantity of product to remain at a given scale; but since the products are mediocre, the only way the market can absorb that quantity is through very steep discounts that induce severe losses.

Since ATI imploded inside AMD, sooner rather than later the company had to adjust the "Goodwill". Currently, AMD's net tangible assets are negative, which in a way means that the company is worthless. This subject will be covered soon; but since we have the Analysts Day, I wanted to advise interpreting today's statements in terms of the viability of the company. Today most investors and analysts are not really thinking about an eventual bankruptcy, but as I have been explaining, the crisis may be more severe than it seems at first sight, so the question of viability will arise later, and today is the day to prepare for that. I am very surprised by AMD's recent stock price crash, because in reality there is no news: all the latest stuff about the K10 bug, delays, slowness, lackluster performance, etc., is mere confirmation of things that were already very plausible possibilities. Think about what may happen if today you do your due diligence and determine that AMD is not viable, and then, in nine months' time, the market begins to seriously question AMD's viability. You would make a bundle!

Wednesday, December 12, 2007

The important thing about K10

I have been reading articles like "Has Intel Crushed AMD?" by Jon Fortt in Fortune's BigTech blog and the Mario Rivas interview [he is the Computing Product Group Executive Vice-President] by Damon Poeter in ChannelWeb, as well as numerous other discussions on message boards about AMD, and I grew increasingly frustrated at how people are losing sight of the truly important things about K10.

There is a perception that if AMD solves the K10 problems of bugs and slow clocks and manages to produce the chips in quantity, then AMD may continue to consolidate its status as a duopoly player and remain a force in the industry that must be taken into account. I think all this optimism, very unfortunately, is unfounded.

Let us suppose AMD had launched K10 processors at 3.0 GHz, for both servers and desktops, around June of this year, and without any bugs of importance. AMD would still be headed down, only not so fast. That is my point. Why? Because the K10 design itself proved to be a dud at so many levels that it is exhausting just to mention all of them. I think the important thing about K10 is that it proved inferior in IPC (instructions per clock) to Intel's existing double duals. This is a fact in the context of two extremely alarming things: Intel's double duals are handicapped by the front side bus (they can't communicate die to die directly, and every new core or processor on the same bus diminishes the effective memory bandwidth per core) and by external memory controller delays. Still, despite the handicaps, the double duals beat fair and square any K10 quad at same-clock comparisons in the vast majority of workloads.
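To see why the shared front side bus is a handicap, here is the arithmetic as a minimal sketch; the 1333 MT/s and 8-byte-wide figures are illustrative assumptions of mine, not measurements of any specific Intel part:

# Illustrative numbers, not measurements: a 1333 MT/s front side bus that
# is 64 bits (8 bytes) wide provides one fixed pool of memory bandwidth.
FSB_TRANSFERS_PER_SEC = 1333e6  # assumed bus speed
BYTES_PER_TRANSFER = 8          # 64-bit wide bus

total_gbs = FSB_TRANSFERS_PER_SEC * BYTES_PER_TRANSFER / 1e9
for cores in (1, 2, 4, 8):
    # Every core added to the same bus dilutes the per-core share.
    print(f"{cores} core(s) sharing the bus: {total_gbs / cores:5.2f} GB/s each")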

This will get much worse, because Intel is already enjoying the advantages of 45nm; over the horizon loom the new advances in transistors, the high-k dielectrics and the metal gates (* see note at the foot), as well as QuickPath, Intel's implementation of P2P that does away with the handicaps. I have demonstrated that there is no need for something as good as AMD's DCA/HyperTransport, because for the vast majority of applications they were used only minimally (just to save a bit of money on the external memory controller and to reduce the points of failure, helping speed to market), so the industry has every reason to expect a much more competitive Intel in the short, medium, and longer terms.

Does K10 have room for future improvement? I was very wrong regarding Core: I honestly thought that the P6 line didn't have room for improvement, but Intel proved me wrong. I will try again to formulate predictions, though:

  • I don't think the cache hierarchy in K10 works. The independent L2 caches, half of the total cache space, are inefficient. The L3 level is too small compared to the L2 level to be justified (according to my simulations, given the latency steps between cache levels in typical architectures, each level should be at least four times bigger than the previous one; see the sketch after this list. These are numbers I use in high-performance optimizations where I try to adapt my software so that the "working sets" maximize cache hierarchy performance. Also, I have said many times that the L1/L2 hierarchy of K8 behaves more like "one and a half levels"; that's why it is so size-efficient). Thus, unless AMD changes this radically, I see K10 underperforming in memory-intensive applications. While the whole L3 is of dubious merit, it still occupies a significant fraction of the processor area and consumes a significant fraction of its power... Some might say that I am saying this in hindsight, but in reality the actual performance numbers of K10 have simply given me the confidence to go public with reservations I had from the beginning. I can't pin the lackluster performance of K10 on any of the important advancements of this architecture (ask Ron, "Cove3" on the InvestorVillage message board, for a complete list), but it has to come from somewhere, and I think the cache hierarchy may be a partial answer.
  • I don't think the migration toward quadcores will happen fast, nothing close to the migration from single core to dual core. A second core really adds usable computing power for normal Windows usage, as valuable as perhaps 70% of the first core; but the third core adds computing power that is hard to use, so it is only about 35% as valuable, and the fourth core is even less valuable. That's why I am so interested in triple cores: I can really think of ways to use a third core, but the fourth is still too far off. This has to do with software engineering and the principle of combinatorial complexity. From the design perspective, the problem with these facts is that while the single-die principle of K10 is oriented toward maximizing the efficiency of the four cores, it does so at very steep bin-split, yield, and complexity penalties. Intel's existing double duals have the priorities reversed: inefficient multicore performance, but with quick time to market, ease of manufacture, and top clock speeds. By the time this situation reverses, Intel will already be in the market with single-die designs, so I am afraid K10 won't ever have the chance to be the adequate design for its time, at least from the perspective of multi-cores.
  • The problems we have seen with K10 are not accidental; I fear they are fundamental: the architecture is single die/four cores, thus complex, thus time-consuming to develop, error prone, difficult to produce, and hard to run at top clock speeds.
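Here is the cache-sizing rule of thumb from the first bullet applied to K10's published cache sizes; the four-times factor is my own heuristic from the simulations mentioned above, not an industry standard:

# K10's published cache sizes, in KiB.
l2_aggregate = 4 * 512   # four private 512 KiB L2 caches
l3 = 2 * 1024            # one shared 2 MiB L3

# Rule of thumb from the text: to justify its extra latency step, a cache
# level should be at least four times the size of the previous level.
required_l3 = 4 * l2_aggregate
verdict = "OK" if l3 >= required_l3 else "too small by this rule"
print(f"L2 aggregate {l2_aggregate} KiB, L3 {l3} KiB, "
      f"rule-of-thumb minimum {required_l3} KiB -> {verdict}")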
I hope to have explained in sufficient detail why I think this is not a circumstantial crisis in AMD's processor business, but a structural crisis that will worsen.

The words of Rivas are very contradictory: He implies that the total performance of the processor really doesn't matter to the enthusiast, which is a lie by itself; and yet the architecture he sells, optimized for multicore performance, is as enthusiast-directed as it gets. He minimizes the performance penalty of the BIOS fix for the TLB bug, contradicting the Tech Report benchmarking (in an article by Cyril Kowaliski that "Chico" asked me to read in his latest comment, Tech Report pounds on Rivas for this), and of course, the interview is literally brimming with promises of improvements that I don't see how to justify. Rivas is the same AMD official who acknowledged in March that the single-die quadcore had been a mistake (Ashlee Vance @ "The Register"), and digging a little more, Rivas, in an interview exactly one year ago, promised a place in heaven regarding Fusion (EETimes, Junko Yoshida), when the company was still trying to justify the ATI acquisition. Read the contradictions of Rivas; they will lead you to conclude that AMD is in full lying mode, presumably because the officials cannot tell the truth, that is, the news is about to get much worse.

I never agreed with the Opteron/K10 comparison. It is true that both are monumental challenges, but that is about the extent of their similarity. Opteron was revolutionary in ways the industry was prepared to embrace, like the P2P connectivity and the emphasis on paving the way for single-die dual cores; and it was conservative and evolutionary on things the market wasn't willing to change: a true upgrade path for the x86 instruction set architecture to 64 bits, AMD64, while Intel was at the apex of its attempt to consolidate the Itanium ISA. K8 wasn't "marketing-driven engineering"; that's why it insisted on the technically superior approach of slower clocks with highly optimized execution rather than the marketing gigahertz of idle instructions represented by Netburst. Today, K10 tries to "innovate" in what is not necessary, like the single-die quadcore, the third cache level, etc., rather than innovating in things the industry is desperate for, revolutionary coprocessors for the consumer market, for example. On the other hand, today the risks associated with the K10 challenge are not at all mitigated by Intel's insistence on an incorrect approach, as at the time of the K8 challenge, but quite the contrary: the risks are heightened by Intel's practical and effective approach. Finally, at the time of the K8 challenge AMD enjoyed the momentum of the superior product design, the Athlon, while today AMD suffers the negative momentum of having the inferior design (thus calling for a more practical approach).

My advice is to be suspicious of the theory that AMD just had a bad streak of problems and mistakes. At least regarding K10, it is very clear that AMD exposed itself to great suffering, and now that the gamble has failed, the real pain is about to begin.

(*) AMD, unsurprisingly, is downplaying the silicon process race. Of course, it is already so far behind, and getting further behind, that it has to resort to denying the negative; but this subject is better left for another article.

Monday, December 10, 2007

Scott Wasson @ Tech Report: a mistake

We have been talking about the "Tech Report" coverage of the K10 TLB error.

Scott Wasson published an article where he explains that he made a mistake:

I wrote more than once in our coverage of the erratum that AMD had initially suggested the problem didn't affect lower clock speeds of the Phenom. Turns out that's not the case. Here is the text of my notes, verbatim:
TLB problem w/virtualization
2.4 will have the complete fix
Have to enable something in the BIOS for the 2.2 and 2.3
Can degrade perf a little bit
[...]
I think I may have read this incorrect information online somewhere
I had also followed internet sources that said AMD explained that the 2.4 GHz Phenom had a bug, and that's why it was withdrawn from the market, but that the same bug wasn't present in the slower versions. The Inquirer may have been the culprit:
This problem was found during speed-binning the B2 revision processors, and this was the cause for the Phenom FX 3.0 GHz delay. It turns out that some CPUs running at 2.4 GHz or above in some benchmarking combinations, while all four cores are running at 100% load, can cause a system freeze.
[...]
9500 (2.2 GHz) and 9600 (2.3 GHz) parts are unaffected by the errata [ my emphasis ] Some 9500/9600 parts may even be overclocked to 2.6, 2.8, 2.9, 3.0 GHz and they will have no problems whatsoever, while some will have this error.

I thought it was important to correct this at the "new post" level rather than as a mere update to old articles or a comment. This emphasizes the importance of having sources properly linked: you can backtrack the origin of your assertions.

Thursday, December 06, 2007

Impact of BIOS patch of TLB Errata 298 measured

Phleanom(TM) logo
Scott Wasson, whom we have quoted before, wrote an article for Tech Report whose conclusions state that the performance hit of the BIOS patch for erratum 298 is as severe as 20% on average. Even taking the memory performance tests out of the benchmark mix, the hit is still more than 13%. This confirms the initial assessment of a 20% performance penalty, and that AMD once again tried to mislead the public into downplaying the importance of the bug.
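For readers who want to reproduce this sort of figure: an "average performance hit" is typically computed as the geometric mean of the per-benchmark ratios. A minimal sketch with made-up ratios (the real per-benchmark numbers are in Wasson's article):

from math import prod

# Hypothetical patched/unpatched performance ratios for five benchmarks;
# the actual per-benchmark numbers are in Tech Report's article.
ratios = [0.78, 0.85, 0.70, 0.88, 0.81]

geomean = prod(ratios) ** (1 / len(ratios))
print(f"performance retained: {geomean:.1%}, average hit: {1 - geomean:.1%}")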

The 'net abounds with reports of how the Phenoms are slower than plain old K8s in certain workloads. Since most consumer applications are very low-threaded, Phenom doesn't get many chances to outcompete its K8 dual-core brothers by throwing more cores at the workload, and now that the BIOS patch castrates its memory performance, it looks truly horrible. In some of the very long comparison tables of Wasson's article, the Phenoms are last in performance, by large margins. The BIOS patch severely affects the only competitive edge K10 had over Intel products, so the comparison against Intel processors has turned hopeless.

Who are the suckers who are buying these Phenoms?

I wish I could leave it at that. But I can't. It turns out that at the height of the crisis, AMD officially came out to say that it is shipping the hundreds of thousands of K10 processors it guided for in the last quarterly report conference call [ Mark Hachman @ ExtremeTech reports that AMD personnel emailed statements with that information ]. On top of the desperate and unethical behavior I describe in "terrible news", I can't fathom how stupid this company may be:

We know that AMD is quite simply not selling all the K10s it was supposed to sell [ we know that it is performing "application screening" before actually shipping Barcelonas, it never launched the expected 2.6 GHz Phenom, it had to withdraw the 2.4 GHz part, and IBM never released the systems to the public so it couldn't certify its benchmark ], so the statement of tracking in accordance with previous guidance must be an outright lie. But that is not the worst part of this statement: AMD actually believes that people will interpret the information that it is selling hundreds of thousands of severely defective processors as good news...

Wasson says that since the performance of Phenom is so mediocre, its only redeeming quality may be the cheap price, so some average consumers may be interested in it, but

I doubt whether the average sort of consumer is likely to purchase a system with a quad-core processor. One wonders where that leaves AMD and the PC makers currently shipping Phenom-based PCs. I'm not sure a recall is in order, but a discount certainly might be. And folks need to know what they're getting into when purchasing a Phenom 9500 or 9600-based computer.
[...]

[A] credible source indicated to us that at least some of the few high-volume customers who are still accepting Barcelona Opterons with the erratum are receiving "substantial" discounts for taking the chips [...] I doubt AMD would have shipped Phenom processors in this state were it not feeling intense financial pressure.

AMD's other major concern here should be for its reputation [ my emphasis ]. The company really pulled a no-no by representing Phenom performance to the press (and thus to consumers) without fully explaining the TLB erratum and its performance ramifications at the time of the product's introduction.
It is even worse: Wasson forgets something he himself mentioned and that I already quoted. AMD also misled the public by telling early reviewers that, since the external bus of Phenoms was going to run at 2.0 GHz, they should set the external bus to that speed for their reviews, while in fact the external bus of the Phenoms actually launched is 1.8 GHz.

I think all of this deserves REPUDIATION.

For a slight touch of comic relief, follow this link.

gfor said of Dave Orton

gfor left a comment in "A-TItanic comments" that I want to share with the whole audience:

Your negative comments regarding Orton are misplaced. As the CEO of ATI, his first and foremost responsibility was to ATI shareholders, and he took excellent care of them.

1) He managed to get $5.4 billion dollars for a company that, had it not been sold, was heading for ~$2B market cap by April 07 based on dismal profits and the R600 fiasco.

2) He knew full well what a disaster AMD-ATI would be and did his shareholders an enormous favor by demanding cold hard cash. The fact that the outside people most familiar with AMD's finances (the ATI management team) did not want to touch their stock with a 10-foot pole should have set off red flags all over. "We will create a dominant company... yeah, and we don't want to be paid in its stock".

Orton's ability to get AMD to overpay by a factor of two for his company, and to pay the bulk of the sum in cash (which AMD could sure have used now), is nothing short of genius. The man was looking out for ATI shareholders and he took great care of them.
According to the Wikipedia entry on David E. Orton, he "enthusiastically supported ATI acquisition by AMD and was one of the main forces behind it". The history of this business deal is well understood, so I won't write a contemporary-history treatise; but I think that Dr. Ruiz and the rest of the managerial team that approved this catastrophic acquisition did it in good faith, and were scammed by Wall Street and the ATI personnel into paying twice what ATI was worth in an unnecessary acquisition. They are that naive, and they have such an inferiority complex that anyone who says "you could do X, which Intel can not" will get their attention, even if X is the most stupid thing in the world. [ This inferiority complex also manifests in the sickening submissive attitude toward Microsoft ].

I wrote about this and the broader subject of big mergers in "Big Merger=Bad Merger": the deals receive excellent coverage in the financial media because Wall Street and its investment banking branches stand to gain hundreds of millions of dollars in commissions and other fees, so they have every incentive to use their considerable leverage on the financial media to give big mergers good propaganda. Also, the managers of both companies get richer, a phenomenon explained in links in that article.

This is like departing tenants throwing a party: there is lots of excitement, people come and have a good time, but by morning the owner is faced with a mountain of trash, vomit on the floors, clogged toilets, a trampled garden, and all sorts of weird stuff. The owner is, naturally, the shareholder of the acquiring company.

Wednesday, December 05, 2007

Erratum 298

The error in K10 that has generated this flurry of controversy is called Erratum 298. I will explain below what it is about, but first I want to put this problem in what I think is its due context.

In "terrible news" I spoke about AMD launching Phenom knowing about the existence of this bug. Because of technical characteristics of the bug and the Linux patch that works around it, I think that the BIOS patch can also work around the problem, therefore, this is not a problem that grants a product recall. Nevertheless, the performance hit of patching a system through the BIOS may be very significant, AMD claims around 10%, independent testers claim around 20%; but it seems that if the Operating System can be patched too, it only hits 1%. In practical terms this bug and the patch are as if AMD would have launched processors 10% slower.

According to "Daily Tech"'s Kristopher Kubicki, AMD halted shipping of K10 pending "application screening", that is, AMD is checking whether the applications of a customer would likely trip the bug or not before shipping. It seems that the bug may only occur when the operating system needs to set the "Accessed" or "Dirty" bits of the page table entry [ I found this article for the people interested in learning about Paging, the meaning of the accessed and dirty bits is explained there ]; like I mentioned in "terrible news", some workloads like supercomputing may not trip the bug, the reason seems to be that supercomputing doesn't do very sophisticated virtual memory management, at least not as complex as virtualization, so the simultaneous conditions required to trigger data corruption or system crash may not occur.

This means that the flow of Barcelona processors to the market is slower than anticipated, and some customers who chose AMD because of the specific advantages of AMD processors for workloads like virtualization are not receiving any product at all. In the case of "consumers", it seems that the company will give them the choice of disabling the patch and living with a buggy system, or taking the 10% to 20% performance hit.

Now that we come to that choice of disabling key functionality of the L3: Kubicki also quotes AMD saying that some tri-cores will have the L3 disabled. This makes sense, so I guess it may be interpreted as good news. Let me explain why:

Caches have been sort of a "loose cannon" in the world of µarchitectures, for instance:

  • The original 266MHz Celerons without any cache were so slow that Pentium MMX 233s were noticeably faster,
  • then Intel solved the problem with a bit of overkill and launched the cheap and very overclocking-friendly Celeron 300A, which became famous because its half-size, full-core-speed L2 cache made it faster than the much more expensive Pentium IIs with their double-size, off-die, half-core-speed L2 caches, especially when overclocked to a 100 MHz memory bus rather than 66 MHz (I owned a Celeron 300A for years; it ran at 450 MHz on a 100 MHz bus without a hitch and outperformed Pentium IIs and Katmai Pentium IIIs of the same speed),
  • the problem that killed the hyperthreading feature of top-of-the-line Netburst processors was cache contention, despite the large sizes of Netburst caches (they were that sensitive to cache misses),
  • one of the reasons for the superiority of AMD's Durons (in their price/value space) was their supersized L1 caches,
  • and one of the great reasons why AMD's K8 could compete with Intel processors with FOUR times the total amount of L2 cache was the very efficient "exclusive" architecture of its L1/L2 caches (here "exclusive" means that data is not duplicated between L1 and L2; see the sketch below),
so, I can understand that the L3 cache in K10 could have looked like a good idea in the design stages, but testing in real-life conditions demonstrated that the extra memory latency and higher manufacturing costs weren't really compensated by how much it helped performance. Still, AMD expended lots of money, opportunity cost, time to market, and risk exposure to bugs to develop this feature in K10, and it ultimately proved to be of dubious value. This highlights, once again, that AMD shouldn't have skipped the intermediate steps between K8 and the "triple challenge", and that "business exploration" is very important.
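As a quick illustration of why the exclusive design is so size-efficient, compare the amount of distinct data the two approaches can hold; a sketch assuming K8-like sizes and counting only the L1 data cache:

# Assumed K8-like sizes in KiB; 64 KiB here is the L1 data cache only.
l1, l2 = 64, 512

# Exclusive (K8 style): a line lives in either L1 or L2, never both.
print(f"exclusive hierarchy: {l1 + l2} KiB of distinct data")
# Inclusive: everything in L1 is duplicated in L2, so L2 bounds the total.
print(f"inclusive hierarchy: {l2} KiB of distinct data")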

Another positive lesson from Erratum 298 is how much more responsive Open Source software is compared to proprietary offerings. Linux already has a patch that emulates the "Accessed" and "Dirty" bits of page descriptors, so the performance penalty gets reduced to that of much more numerous page fault exceptions; on the other hand, Microsoft isn't even bothering to patch around the K10 problem. It is true that the patch performs nothing short of "major surgery" on the memory subsystem of the Linux kernel, but while AMD can actually write a patch for Linux, I guess something that radical is unthinkable for Microsoft. For the same reason, I expect the Open Source virtualization projects Xen and VirtualBox to be much more agile than, let's say, VMware in lending a helping hand to AMD to still allow early K10s to run virtualization without an extreme performance hit.
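To make the mechanism concrete, here is a conceptual sketch, in Python purely for illustration; the real patch works in kernel C on real page table entries, but the idea is the same: map the page so that the first access or write faults, and let the fault handler set the bits the buggy hardware path would have set.

# Conceptual model only; not the actual Linux code.
class PTE:
    def __init__(self):
        self.present = False   # cleared on purpose: the first access traps
        self.writable = False  # cleared on purpose: the first write traps
        self.accessed = False
        self.dirty = False

def fault_handler(pte: PTE, is_write: bool) -> None:
    # Software does what the (buggy) hardware fast path would have done.
    pte.accessed = True
    pte.present = True
    if is_write:
        pte.dirty = True
        pte.writable = True

pte = PTE()
fault_handler(pte, is_write=True)   # first write to the page
print(pte.accessed, pte.dirty)      # True True, at the cost of one extra fault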

I received an anonymous comment pointing to "andikleen"'s comment, which leads to the code of the patch at x86-64.org and its explanation. [ Thanks to whoever posted the comment, but please, leave a name; there is no need to sign in to anything, just overwrite "anonymous" with a name of your choosing and that'll do ]. Cyril Kowaliski @ Tech Report also comments on the bug and the Linux patch, finishing with a very important point, the apparent contradiction that AMD says few customers will be affected by this problem, yet at the same time strongly advises Phenom motherboard manufacturers to enable the BIOS fix, which zaps at least 10% of performance, without giving users the option to disable it. By now the whole world knows that the bug is severe; I honestly don't understand what AMD is trying to achieve by insisting on minimizing it...

According to Kubicki's article that we have been discussing, AMD will continue to ship defective processors until the next stepping, B3, of both Phenom and Barcelona launches in March... although the "2.6 GHz Phenom model 9900 is not affected", so, presumably, the Phenom 9900 would be the first B3 K10.

There is more K10-related news: AMD is re-emphasizing the 65nm K8, "Brisbane" [ DailyTech ]. "Of course!" is what I say. There never actually was any need for AMD to forget about K8; the world is barely moving to the dual-core wave, and AMD should have focused on improving its dual-core offerings rather than on the foolish "triple challenge" adventure that led to processors slower than acceptable, hotter, and buggier too. K8, on the other hand, still has untapped potential. Unfortunately, since AMD has such a bad 65nm process, it just can't go for the 3.0 GHz and 3.2 GHz speeds; the processors currently manufactured at those speeds are all 90nm and will be discontinued.

In any case, AMD will have to ride for three more months, and then some, on the back of the architecture it has been slighting for over a year now: K8...

Tuesday, December 04, 2007

A-TItanic

This article is dedicated to Henri Richard, and to a lesser extent, to Dave Orton, because they have been the top rats that got a fortune from AMD and have abandoned the ATItanic(1) ship before facing the consequences of their decisions.

Mr. Orton was the CEO of ATI who managed to sell that nest of lice to AMD at a 20% market premium. Mr. Orton didn't give the evidently catastrophic situation AMD was in much time to hurt him; he resigned in July of this year.

The inertia that Mr. Orton's ATI carried over to AMD was the disappointing and power-hungry 2900 series. It is also interesting to note that although AMD acquired ATI primarily to guarantee support for its processor initiatives, once inside AMD, ATI hasn't done even trivial things like supporting Quad FX. Now that the company has announced that it is killing Quad FX, we can be sure that there will never be any ATI chipsets for it; the existing nVidia chipset will remain the only one...

[ Henri Richard holding a pair of Quad FX processors ]
Speaking of Quad FX, that foolish platform (2) must have been the invention of someone at AMD, very probably Mr. Henri Richard, Senior Vice President of Marketing. Since the company is terminating this initiative only now that Mr. Richard is gone, it makes you wonder whether it was his project.

There are other decisions. Most managers at the upper echelons must have pushed for the disastrous "triple challenge" (3). Someone, very probably from marketing, thought it was a good idea one year ago, in the last quarter AMD had strong demand, to give preference to OEMs, probably specifically Dell, over the historic businesses in the channel, beginning the string of net losses of about $4 per share. Someone must have thought that ATI, a company with few tangible assets beyond its technical and marketing skill, was fairly valued. I don't think that production-minded people like Mr. Meyer or Dr. Ruiz would have taken the decision to acquire ATI without the assessments of people from marketing. And finally, someone must have decided that launching defective products was the lesser evil compared to postponing the launch of any K10 altogether.

Mr. Richard departed AMD within two weeks of the very late (paper) launch of Barcelona, the first K10-based processor. What exactly happened with K10?

  1. The company is not able to produce them even in small quantities, not even enough for IBM to certify its benchmark.
  2. Not able to produce processors of even mediocre speeds, nothing close to what it desperately needs, even slower than the already reduced estimates.
  3. The company launched defective Server and Desktop K10 knowingly.
  4. Promises of quantities or debugged products only for late Q1.
But all of the above is not really important. I heard/read Dr. Ruiz on at least two occasions saying that Barcelona wasn't going to materially affect the financial numbers of the company this year, that this year's production was for "design wins"; but the most important thing we now know about K10 is that it sucks, big time.

I find the timing of Mr. Richard's departure very good: he left just before the shit hit the fan.

Guys, I have to thank you. Even though you fooled me several times, in the end your departures allowed me to see through your bullshit, and thus you indirectly helped me to hold on to my very bearish portfolio bias on AMD with conviction. That has paid off handsomely.

(1) A fellow, "cass", posted a comment here where he refers to AMD/ATI as ATItanic.
(2) Quad FX is a very foolish platform without coprocessors: a regular "desktop" computer finds very few uses for more than two cores; the economy/speed of unbuffered DDR2 at 800 MHz instead of the Opterons' registered DDR2 at 667 MHz does not compensate for the platform's premium cost nor the lack of options; Quad FX, without consumer-oriented coprocessors, was always an expensive platform for cheap memory, an oxymoron.
(3) The triple challenge refers to developing a new architecture, K10, on an immature process, and with the complexity, yield, and bin-split problems of single-die quadcores; all at the same time, and unnecessarily.

More terrible news about Phenom and K10

[ UPDATED 12/5/12:04 CST ]

More lies come from AMD, and the performance of their products is significantly worse than in early reviews:

Giant, in a comment at Roborat64's blog, mentioned something important:

Tech Report reports that their original Phenom benchmarks were done incorrectly and that actual performance is worse than reported; you may find it in the penultimate paragraph:

We don't yet have a BIOS with the [ L3 Cache ] workaround to test, but we've already discovered that our Phenom review overstates the performance of the 2.3GHz Phenom. We tested at a 2.3GHz core clock with a 2.0GHz north bridge clock, because AMD told us those speeds were representative of the Phenom 9600. Our production samples of the Phenom 9500 and 9600, however, have north bridge clocks of 1.8GHz. Because the L3 cache runs at the speed of the north bridge, this clock plays a noteworthy role in overall Phenom performance. We've already confirmed lower scores in some benchmarks.
It is reasonable to assume that other sites may have made the same mistake, so the already bad Phenom reviews actually overstate its performance...

But that is not the important thing I want to talk about. It happens that Tech Report almost confirms that AMD lied about the reason it couldn't launch Phenoms at 2.4 GHz and faster, a bug that supposedly affected only the 2.4 GHz parts and above; it turns out that the problem is pervasive across all current K10 incarnations, from Barcelona to Phenom. This is what Scott Wasson at Tech Report said about the subject yesterday: "Apparently contradicting prior AMD statements on the matter, [Michael Saucier, Desktop Product Marketing Manager at AMD,] flatly denied any relationship between the TLB erratum and chip clock frequencies".

Not just this: since the bug first showed up in Barcelona (which eventually led to a drastic cut in the supply of the defective product, shipping only to those whose usage patterns, such as supercomputing, are unlikely to trip the bug, as opposed to virtualization workloads that are likely to trip it), AMD should have expected Phenom to have the same problem; but instead of postponing the launch of Phenom, the company went ahead and launched a defective series of processors.

Scott Wasson connects the dots and mentions this:
[T]he presence of the TLB erratum may explain the odd behavior of AMD's PR team during the lead-up to the Phenom launch, as I described in my recent blog post. The decision to use 2.6GHz parts and to require the press to test in a controlled environment makes more sense in this context
It turns out that the BIOS patch that prevents the problem, which also includes microcode updates, turns off functionality of the L3 cache, with an official performance impact of 10%, or 20% according to early independent reviews. Let's use the official 10%: if we simply discount clock speeds by 10%, the products AMD launched were effectively no faster than 2.1 GHz... And this patch is not available today for the majority of 790FX platforms!
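The clock-equivalent arithmetic from the paragraph above, spelled out:

# A 10% performance penalty makes each launched Phenom behave roughly like
# a part clocked 10% lower (the official figure; reviews say 20%).
for ghz in (2.2, 2.3):
    print(f"{ghz} GHz with the BIOS fix ~ {ghz * 0.9:.2f} GHz without it")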

There are also rumors that AMD will launch triple cores without the L3 cache. This would confirm my assessment that the L3 cache provides dubious performance advantages, but as you may see, it is another point of failure in the development of the architecture.

In summary:
  1. K10 is buggy, in accordance with the predictions regarding the "triple challenge" of developing a new architecture on an immature process while managing the complexities of single-die quadcores [ I wrote an old article about why I expected the sheer complexity of "Core" to be too much for Intel, but it turned out to be the "triple challenge" complexity that was too much for AMD, just as "Intel's 65nm is Marketing" applies much better to AMD ].
  2. AMD lied about why it couldn't launch Phenoms at 2.4 GHz and above (2.6 GHz parts were promised a long time ago); this is more of the same bullshit as saying that the L2 latencies in Brisbane were made higher supposedly to allow for larger caches in the future (the caches didn't increase in the 12 months after the launch of Brisbane, by the way).
  3. AMD knowingly launched defective products.
  4. The performance reviews of Phenoms must be revised downward significantly once the actual bugfixes are available, which may take a while!
  5. AMD influenced reviewers to make a mistake (setting the external clock to 2.0 GHz) that would show Phenom in a more positive light.
  6. AMD tried to hide the problem at the launch of Spider.
I felt the need to update this post because I didn't speak about the implications:
Traditionally this is the most important season for businesses like AMD, but the products in the market may even be recalled, and it will take some time, on the order of months, for AMD to correct the problems; we are talking about late Q1, the worst business season...

I just wrote in "A-TItanic" that Dr. Ruiz said several times that K10 would not affect the finances of the company this year, that early production was for "design wins"; but if something is very evident, it is that even bug-free K10s stink. So who is going to wait months more to buy such mediocre products? What motherboard designer is going to bother with K10 features like HyperTransport 3 and such? What is AMD going to do with the K10 products it has already manufactured? What about Penryn? This is all too much to ask of the venerable K8 architecture.

After all, it seems that K10 will affect the finances of AMD this year: very negatively.

Saturday, December 01, 2007

Another margin call!

The margin maintenance rules of eTrade are getting annoying: I have an investment system for AMD in which, believe it or not, every time I make lots of money, I get a margin call!

It has to do with the margin rules. At eTrade, as at many other brokerages, the shares you own may be used as collateral for the margin loan; that is, the guarantee you give of being able to pay your margin balance is the shares themselves. Options, however, cannot be used as collateral, so if for some reason you gain lots of money on options while you lose comparatively less on shares, you may lose "margin equity" and get a "margin call". This has happened to me so often that I have lost track of the number of times.

My investment system is not very easy to describe, so be prepared to read this several times: I write short-term at-the-money covered calls, meaning that I sell calls with strike prices close to the current stock price, expiring in a few weeks, backed by shares; but in reality, I use the covered calls as a hedge and as cash flow to pay for the real investment, quantities of very long-term, far-out-of-the-money puts. Since I gain either way, whether the shares move up or down (and even more when they move sideways; yeah, I am like the casino: "the house always wins"), I go full tilt with margin purchases, and when I say "full tilt" I mean that I really buy everything I can buy on margin [ note: there are a number of adjustments that I make that may imply acquiring calls or other complicated plays against the market or bullish on the market, but the bottom line is the written covered calls with the shares overprotected ].
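For the curious, here is a minimal single-expiration payoff sketch of the system just described; every number in it is made up for illustration, and it ignores margin interest, spreads, and the rolling of the calls month after month, which is what makes the middle of the price range profitable over time:

# All figures hypothetical. Long 100 shares, short one near-the-money call,
# long several far-out-of-the-money puts financed by the call premium.
def pnl_at_expiry(price, shares=100, basis=13.0,
                  call_strike=13.0, call_premium=0.60,
                  put_strike=7.5, put_premium=0.15, n_puts=4):
    stock = shares * (price - basis)
    short_call = 100 * (call_premium - max(price - call_strike, 0.0))
    long_puts = n_puts * 100 * (max(put_strike - price, 0.0) - put_premium)
    return stock + short_call + long_puts

for p in (5.0, 7.5, 10.0, 13.0, 16.0):
    print(f"AMD at ${p:>5.2f}: P/L ${pnl_at_expiry(p):>8.2f}")

Note how a crash makes the puts pay off far more than the shares lose; that is exactly the scenario that triggers the margin call described below, because the appreciated puts don't count as collateral.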

Anyway, when AMD crashes, as happened recently, those puts really appreciate, much more than what I lose on the shares (the gains on the written calls are not significant in this case); but since the puts do not count toward margin equity, I get a margin call, and typically it forces me to sell a chunk of long-term puts, which pisses me off 'cos I have to make sure the order gets executed on the margin-call day, and so I get hit with the bid/ask spread, which may be huge in highly volatile environments...

Fortunately, this time I really wanted to reduce my AMD positions, because half the downward movement I expected for 2007 has already happened, so the next half is not so clear: I am not as sure that AMD will go below $7 now that it is at $10 as I was, when it was at $13.50, that it would go below $10. I could, in principle, "neutralize" the position so as not to speculate on whether AMD will appreciate or not, with the intention of just "milking" the written calls at over 1% per month in total gains, including margin interest and all; but I think I can give my money better use. All in all, I sold 1/4 of my AMD positions and I am now half as bearish as I was before.

Tuesday, November 27, 2007

How Foolish of Me

I have been thinking for a while about writing an article on my mistakes and the learning process that made me turn 180 degrees. I hope my personal experience may help other investors. I have written about my screwups several times, especially in the recent article "Exploration", but this subject has not been covered comprehensively. Anyway, I was a bit too ashamed to really get it done, waiting in hopes of a bit of vindication from the market, which happened recently. Also, somebody posted a comment on my last post pointing out some of my mistakes; the author actually did some research into old posts of mine, and his cue has been the straw that broke the camel's back.

I began as a "staunch" AMD supporter and wrote profusely about the chances this company had to displace Intel, both in technology and in finances. Eventually, after losing lots of money and having several changes of mind, I renounced my own opinions. I finally settled at pretty much the opposite opinion, that AMD is collapsing, and I have been very consistent about AMD's deterioration these past 12 months. I also anticipated the market with my investments, which recently led to significant gains, so I got half a vindication; perhaps I am not a total fool after all.

Still, the most expensive mistake of my whole life has been being dismissive of the Core µarchitecture from late February to October of 2006. You may go back to my posts explaining why I made the mistake, but from July of 2006 on, the process of discovering how wrong I was is very well chronicled in this blog, and you may find a good summary in "Exploration". There is one thing I was half-right about regarding Core, though: I said that Intel would be relying solely on shrinks to improve speeds and performance, which has been half-true; it hasn't been just shrinks but overall silicon process improvement, and a new µarchitecture with substantially different features, like point-to-point interprocessor communication and an integrated memory controller, is approaching. Anyway, I think this subject has been well covered in the blog, so I am going to concentrate on other aspects:

The fellow commenter quoted from "Third Factory" (late May of 2006) this list of reasons for "AMD to get to over 50% market share":

  1. Good, proven management
  2. Good, proven, exciting technology
  3. The brand name has appreciated significantly, creating a loyal customer base
  4. Technological leadership assures media exposure, free good publicity, public awareness, together with the enthusiasm of partners in the ecosystem
  5. The ecosystem itself
  6. Production Capacity!
I was right on several of these items, or at least there weren't too many reasons to think otherwise:

Was AMD's Management good, up to May of 2006? -- Yes! For the most part, the people in high positions at AMD are either those who saved the company and succeeded big time with Hammer, or their chosen successors. Hammer was quite a gamble, full of very hard technical and market challenges at which AMD triumphed. With the good work these people inherited, they broke barrier after barrier, taking the company to a position of undisputed leadership. But, as I explained in "Exploration" and many other posts, they later proved clueless about how to capitalize on the leadership AMD had attained, and overstretched the company's resources, with tragic consequences. In late May 2006 there weren't many indications of the catastrophic decisions to come; if anything, only the dull launch of AM2 and the insidious denial of the Core µarchitecture threat. Yet, as I explain in "Exploration", I had every intellectual tool to suspect that AMD's management could fail miserably with AMD in a position of leadership, terrain unfamiliar to Management; thus, on top of making a huge mistake, I missed my unique opportunities to realize it before it was too late.

AMD still has the exciting technology. You only have to read "Cove3"'s posts in the Investor Village AMD message board to get convinced; Cove3's unfailing hopes of an eventual AMD superiority still have a bit of a chance due precisely to the design superiority.

How is it that AMD still holds a larger-than-historic market portion, especially now that every single AMD processor is very far behind Intel's offerings, Intel is providing better value, and the new processors are as mediocre as or worse than the old generation? AMD enjoys the inertia of the massive momentum it generated in the days of K8's rule: the expansion of brand awareness, the opening of business with all major OEMs, etc.

How was I supposed to predict that AMD's management was going to kill the "Virtual Gorilla"? I am still puzzled by the irrationality of the ATI acquisition!

Regarding production capacity, at the time AMD had a credible schedule of capacity growth, and its production was sold out, even after Core-based processors were launched.

So, I wasn't quite wrong from the point of view of my analysis of AMD's possibilities; it is just that Management did very weird things with those possibilities. Conversely, what I identified as Intel mistakes turned out to be grand slams because AMD made turds of its possibilities; I still don't think I was all that wrong regarding Intel. That was yesterday; things now are totally different: Intel rules and AMD tries to follow.

Also from "Third Factory", I was quoted this:
Confirming that AMD is closer to the bleeding edge of technology than Intel, there is the potential of things such as Z-Ram (Zero Capacitor RAM with 5 times higher densities than regular six-transistor flip-flops). I don't know about the introduction rates for semiconductor technology, but it seems that very soon, like in the scale of one or two years, AMD will be able to include Z-Ram either as L2 or L3 cache memory. All that separates AMD from glory (and Intel from demise) is 8 Mb of cache internal to the µ-proc. Z-Ram may be precisely that. Remember that the gigantic L2 caches is the snorkel that prevents the already shit-submerged Intel from drowning, Z-Ram may nullify that...

I screwed this up. But still, the greatest single contributor to Core's superiority is the large shared L2 cache. AMD's pathetic incompetence at both evolving the K8 architecture toward a shared L2 cache and manufacturing large caches led to the cache architecture of K10, with its performance hits from inefficiency (half the cache space, the four L2s, is not shared), small size (a 2 MB shared L3, 4×512 KB unshared L2s exclusive of L1, and 8×64 KB of L1 in total for a quadcore), and the introduction of a third level with its associated latencies. I may once have adopted the thesis that a shared cache could lead to race conditions, but I think I quickly grew out of it; shared caches are superior, period. Another thing is that I had faith in Z-Ram; only recently did I lose confidence in it, the same way I lost confidence in SOI (more about this in a minute).

The commentator wrote the following:
And here's a secret for you - EVEN if AMD implemented high K, immersion litho ADDS NOTHING TO PERFORMANCE - it is merely another technique to print the same size 45nm features (Intel chooses a 2 pass process on the critical steps instead). In the end the feature size is the same. Immersion is just another PR gimmick to make it seem like AMD is ahead, while the scary truth is Intel has far superior litho capabilities as they are able to EXTEND AND REUSE the older dry litho one generation further. Surprisingly, even with a 2 pass process, it is still cheaper than a single pass immersion process as the immersion tools are nearly 2X higher price, run at slower speeds and are less mature than existing dry litho technologies.

I never said that immersion would give AMD any performance edge. It is true that, if anything, immersion gives production cost advantages; but my thesis has always been that AMD should prioritize performance over everything else, then power efficiency, and only then production economies. Unfortunately, the company apparently chose exactly the inverse order of priorities. Recently I have hypothesized about the reasons: AMD assumed it was going to be capacity-bound, even after the early warnings regarding the Core µarchitecture, and kept executing plans that assumed extraordinarily strong demand for entry-level/mid-level performance, even in the face of mounting evidence that Vista was a flop, that the market was saturating due to Intel's "osbornization" of Netburst, and that AMD-specific demand was deteriorating fast.

AMD management's insistence on prioritizing efficiency over performance always struck me as "fishy", production costs being easy to misreport and power efficiency being an easier goal than absolute performance. Then, when the highly anticipated 65nm launch happened late, in scant quantities, fat (the equivalent of a 75nm shrink), slow, with worse IPC, and with bullshit offered to explain the worse IPC, I finally gained conviction in the Bear case.

Until then I had trusted AMD's Management. I had increasingly strong disagreements with Management on strategic issues, like not launching the "coprocessor revolution" and not committing to make the Linux market a stronghold, but I always gave them the benefit of the doubt. I took the assumption of the "triple challenge", together with its difficulty and its "unnecessariness", as evidence that AMD's manufacturing capabilities were superb. I couldn't conceive that Management could be so irresponsible as to commit the whole company to this without guarantees of success, but the bad 65nm process was overwhelming evidence to the contrary. This had staggering implications: if AMD's management could be that irresponsible, there was no reason to suppose they knew what they were doing when they acquired ATI; the single-die quadcore couldn't possibly be any good; no good new processor would be launched this year; ATI would hemorrhage its Intel-based market share to nVidia without mitigating it through increased AMD-based market share; and the finances of the company would deteriorate to the point of mortgaging the R&D and capital expenditures necessary to accelerate the development of Fusion and to escalate the investments that 45nm and smaller feature sizes would require. This would become a vicious cycle, an exponentially accelerating decline; and, finally, the kludgy-but-practical approach of the double duals would become a great instrument for Intel's domination and a bridge to architectures that would close the design disadvantage. Metaphor: AMD tried to jump ahead of Intel but stumbled and fell, suffering life-threatening polytrauma.

I was to be disappointed by AMD's management once more: the Analysts Day last December. I got fooled because I still thought management could not lie so shamelessly as to promise all the things they promised. The market was fooled too; the price momentarily shot up, but very soon reality took over.

When AMD's 90nm process was no worse than Intel's 65nm process in early '06, there were reasons to think that the upcoming 65nm process would be vastly superior, and that SOI had inherent advantages over strained bulk silicon. Within that frame, "65nm is Intel Marketing" made sense. The key argument there was that there is a problem of "diminishing returns" at ever smaller feature sizes, and I misidentified Intel's strained silicon as more vulnerable to this problem than SOI. I still would like to know what is so special about AMD's 90nm process that makes it superior to its successor, but given the evidence, it seems that the diminishing returns are more acute in SOI than in strained silicon; perhaps the heat-dissipation disadvantage of SOI grows faster than the inherent advantages of smaller feature sizes accrue. I also realized very late the significance of Intel's R&D muscle when applied in full force: the great advances of metal gate/high-k transistors.

I am particularly ashamed of "Good time to buy AMD", published on October 19, 2006. That the price actually went up until December is beside the point; that article reflects doubts about the Bear case and residual optimism: trusting management on topics ranging from the merits of the ATI acquisition to believing them capable of great initiatives like the "coprocessing revolution" when they had already farted out the "Quad FX"; seeing strength in AMD demand when all there was was simple inertia; and not understanding that although Intel was in a reactive position, AMD's successes and failures at so many innovations gave Intel a clear view of what to prioritize, just what it needed to allocate its resources efficiently and ultimately mount formidable competition.

I wasn't clueless in the AMD-Intel periphery: Dell crashed, the Dell-AMD deal apparently turned out not so good for AMD, Vista flopped, Sun solved its personality complex, VMWare pulled off its sensational IPO (although I was too timid to join in early and tried to milk volatility at the worst moment), Apple is triumphant with its iPhone, Google has gone stratospheric... I developed a partial model of price manipulation that actually boosts my already successful and proven "money pump" system. But I can talk about those subjects, with concrete predictions, in subsequent posts.

Monday, November 26, 2007

AMD's 65nm Process Hurts AMD

In our blogosphere neighborhood Roborat64 addressed the topic of the involution of AMD's product line. I would like to complement his words with a bit of updated concrete evidence.

In "New > OLD" [ Note: I tend to shorten the names of articles if they are properly linked to, this article originally is named "The Fundamental Law of Progess [ sic ] NEW > OLD" ], Roborat64 says: "the fundamental law of progress [...] demands all NEW PRODUCTS [TO] BE BETTER THAN OLD PRODUCTS", but "it is increasingly alarming how AMD seemed to be moving backwards". "The introduction of slower 65nm CPUs and now K10's inferior performance to K8 are just some of the bad habits AMD seemed to be developing".

If we rewind one year to the introduction of 65nm K8, we will see that, from the point of view of the customer, AMD offered worse new products: processors with increased L2 cache latencies, that is, a slightly smaller IPC, and clocks not as fast as the fastest 90nm products. At the time, some people thought that AMD emphasized low production costs over higher premiums because AMD was production-bound, especially due to the possibly great increase in demand coming from Dell [ many other posts contain abundant references to the catastrophic consequences of this strategy ], and the optimists assumed that there was no problem with AMD's 65nm process, that the company would quickly be able to jack up clock speeds on the new process. But that didn't happen. The optimists then insisted on the superior power efficiency of the 65nm parts; unfortunately the market cares very little about this parameter, and they don't want to accept that fact.

I predicted serious problems with the speeds of the 65nm process as soon as I saw AMD's explanation for the increased L2 latencies in 65nm K8: supposedly, the architecture increased the latencies to leave room for future L2 expansions. Taking into account that the L2 of Brisbane was already half the capacity of the high-end 90nm parts, 512 KB versus 1 MB per core, and that AMD "could cross that bridge when it got there" for a subject as simple as L2 parameters, there was no way to interpret AMD's explanation other than as bullshit: the company was forced to cut corners because it could not do a good shrink. Taking into account that the 65nm parts were introduced later and in smaller quantities than expected, I hypothesized that the problems were serious. Confirming this assessment, AMD has launched ever more disappointing processors, and contrary to all tradition, the products with the highest clocks are all on the old process!

We must visit AMD's official pages to confirm this highly unusual situation: go to http://products.amd.com/en-us/DesktopCPUFilter.aspx and select both "90nm" and "65nm" in the "Manufacturing Tech (CMOS)" pull-down list:

90nm top of the line: 6400+: 3.2 GHz with 2 MB L2, 1.35/1.40 V, 125 Watts.

65nm top of the line: 5200+: 2.7 GHz with 1 MB L2 (half the size), 65 Watts (half the consumption, or roughly 61% of the power after accounting for the speed difference; much better, but then again, what does it matter that it consumes less power if it is slower?)
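For transparency, here is the arithmetic behind that "roughly 61%" figure, as a minimal Python sketch. Note that normalizing TDP linearly by clock is a simplification of my own; real power scales worse than linearly with frequency, which if anything flatters the 65nm part:

    # Straight arithmetic on AMD's published TDPs and clocks.
    tdp_90nm, clock_90nm = 125.0, 3.2   # 6400+: Watts, GHz
    tdp_65nm, clock_65nm = 65.0, 2.7    # 5200+: Watts, GHz

    raw_ratio = tdp_65nm / tdp_90nm                                    # "half consumption"
    per_ghz_ratio = (tdp_65nm / clock_65nm) / (tdp_90nm / clock_90nm)  # speed-adjusted

    print(f"raw power ratio:      {raw_ratio:.0%}")      # 52%
    print(f"speed-adjusted ratio: {per_ghz_ratio:.0%}")  # ~62%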

It may be argued that the real 65nm top of the line is the Phenom 9600, the quadcore at 2.3 GHz. Too slow, that is what I say. It is so slow that it has been proven slower in many workloads than AMD's own dual cores... [ link may be found here ]

A very similar thing happens with the Opterons:
No 65nm dual-core Opteron; a measly 2.0 GHz top-of-the-line quadcore, the 2350 and 8350 [ link ]; and a top clock of 3.2 GHz for the 90nm top-of-the-line dual cores, the 2224 SE and 8224 SE [ link ].

AMD's own price lists demonstrate the horrible catastrophe the company is in:

Family         Model     Process  Freq (GHz)  QC/DC  Price    Price per core
Opteron        2350      65nm     2.0         QC     $389     $97.25
Opteron        8350      65nm     2.0         QC     $1,019   $254.75
Opteron        2224 SE   90nm     3.2         DC     $873     $436.50
Opteron        8224 SE   90nm     3.2         DC     $2,149   $1,074.50
Athlon 64 X2   6400+     90nm     3.2         DC     $220     $110.00
Athlon 64 X2   5200+     65nm     2.7         DC     $125     $62.50
Athlon 64 FX   FX-74     90nm     3.0         DC     $300     $150.00


An interesting first observation is that the price per core of 90nm parts may be more than four times (!!!) the price per core of the 65nm processors...
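To make the "(!!!)" concrete, here is the computation from the table above, comparing like-numbered Opterons; trivial Python, nothing assumed beyond AMD's own list prices:

    # Price per core from AMD's price list: 90nm dual cores versus
    # their like-numbered 65nm quadcore counterparts.
    per_core = {
        "2350 (65nm QC)":    389 / 4,    # $97.25
        "8350 (65nm QC)":    1019 / 4,   # $254.75
        "2224 SE (90nm DC)": 873 / 2,    # $436.50
        "8224 SE (90nm DC)": 2149 / 2,   # $1,074.50
    }

    print(f"{per_core['2224 SE (90nm DC)'] / per_core['2350 (65nm QC)']:.2f}x")  # 4.49x
    print(f"{per_core['8224 SE (90nm DC)'] / per_core['8350 (65nm QC)']:.2f}x")  # 4.22x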

We know that the 65nm processors take more than half the die area of the 90nm processors, because the shrink was to an equivalent node size of over 70nm; we must also expect the yields of the 65nm process to be lower and the binsplits to be worse. Furthermore, in the case of quadcores, we also know that yields and binsplits are quadratically worse than for dual cores (see the sketch below). From the point of view of production costs, the 65nm parts are not even twice as cheap; but from the point of view of ASPs, they sell for way less than half the 90nm parts. It is clear from the table above that the 65nm process is actually hurting AMD, because the company spends relatively more to manufacture the 65nm product only to obtain a smaller selling price.
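To see why single-die quadcore yields degrade "quadratically", here is a minimal sketch using the classic Poisson defect-density yield model. The defect density and die areas below are invented round numbers for illustration, not AMD's actual data; the point is only that doubling the die area squares the yield:

    import math

    def die_yield(defect_density, die_area_cm2):
        # Poisson model: probability that a die has zero killer defects.
        return math.exp(-defect_density * die_area_cm2)

    d0 = 0.5         # defects per cm^2 (assumed)
    dual_area = 1.2  # cm^2, a hypothetical dual-core die
    quad_area = 2.4  # cm^2, a single-die quad at twice the area

    y_dual = die_yield(d0, dual_area)
    y_quad = die_yield(d0, quad_area)

    print(f"dual-core yield:    {y_dual:.1%}")       # 54.9%
    print(f"quad-core yield:    {y_quad:.1%}")       # 30.1%
    print(f"dual yield squared: {y_dual ** 2:.1%}")  # 30.1% -- identical

Binsplits suffer the same way: the slowest of four cores on a die sets the bin for all of them, and the chance that all four bin high falls off just as fast.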

It may be argued that this is a biased explanation because AMD chose the 65nm process for volume and 90nm for premiums, but this is illogical:
  1. We know that the market is saturated with entry-level crap (Intel Netbursts still being purged, Celerons, and even Yonah/Sossaman Core Duo/Solo incapable of 64 bits), thus the product moves only through steep discounts; if AMD were able to, it would target the 65nm product at the upper segments.
  2. 65nm should have many inherent performance advantages; by insisting on the semi-exhausted 90nm process for its top-of-the-line performance, AMD is mortgaging the future of the company.
AMD's claim that it has already shifted focus to the 45nm process is laughable: it has been pretty much demonstrated that there is something seriously wrong with the 65nm process that hasn't been solved, so why would it be any different at 45nm? Perhaps the 65nm process doesn't allow the same performance as the 90nm process due to inherent limitations; perhaps power consumption didn't go down in the same measure that the heat-dissipation problems of SOI went up. I don't know, I am just speculating, but what I do know is that there is a serious problem and AMD is denying its existence.

I feel as if I were beating a dead horse by saying that if the 65nm process didn't scale, it is of very little relevance that AMD will do immersion 45nm without the excellent advances in high-k dielectric/metal gate transistors that Intel is bringing to market. It is also beating a dead horse to point out that K10 is a dubious improvement in IPC, due to the increased latencies of the L3 cache, while Intel is coming with integrated memory controller/point-to-point interprocessor communication architectures with enlarged caches and higher clocks! But what AMD and its cheerleaders are doing, pinning new hopes on the 45nm process and the dual-core K10 to come, is putting the cart before the horse.

Tuesday, November 20, 2007

Exploration of the Business Space

[ Article Preview ]

In the parallel evolution of AMD and Intel, the advantages of Intel's economies of scale express themselves in several ways that are not very obvious. This post will explain one of those ways, which I didn't fully take into account when I was bullish on AMD, and from this principle I extract an important lesson regarding investment in the stock market.

How was it that Intel regained the absolute leadership of the processor market? At the heyday of the Netburst architecture, with the company absolutely committed to its development, Intel still gave the P6 architecture chances to develop. Not only that, but the company also persevered with the Itanium while quietly developing the strategic plan B of Yamhill, what eventually was christened 'EM64T', the AMD64 clone. So, while officially the future of Intel was the Itanium architecture, with Netburst as the required transition, the company still kept going forward with architectures that were their antithesis: the AMD64-capable Netburst and, much more remarkable, the apparently exhausted P6 architecture [full line here].

At times I thought that it was a gigantic waste of resources to expend R&D on so many dead ends. I mean, after I studied the Itanium architecture it became clear to me that the thing was going nowhere [ too complicated to implement, too disruptive for the industry, and with too many 'brain damaged' features designed to exclude the competition ], and I became a fervent supporter of the original Athlons because it was obvious that Netburst was all marketing gigahertz; but although I disagreed with the directions Intel took, I still thought they knew what they were doing. That is, I thought that Intel could somehow force the market to become a monopoly of very inferior products the way Microsoft forced the Operating System market to turn into the monopoly of the worst Operating System of its time.

By contrast, AMD's plans seemed crystal clear, and sound: to concentrate 100% of its energies on an evolutionary approach (AMD64), to capture the best features of the architectural propositions of the time, like the point-to-point interprocessor communication that implied an integrated memory controller, SOI versus plain ole' bulk silicon, and so on. That is, I thought that although Intel had several times the R&D budget of AMD, since Intel was so wasteful and AMD so focused, the underdog had plenty of chances to topple Intel from its leadership position for good.

But needless to say, I was wrong, and paid dearly for it. My mistake was to overlook that the competitive situation between these two companies is so complex that it was justified to spend R&D money on several contradictory approaches, no matter how improbable their success might seem. I should have known better: I have worked long enough in the field of Artificial Intelligence that, of all people, I should have realized that Intel was conducting extremely valuable "search" (exploration) of its business space, and that the more contradictory the approaches it followed, the higher the assurance of ultimate success. And sure enough, although Intel grossly miscalculated its priorities, assigning them to Itanium and Netburst (and in that order!), the plan Bs of Yamhill (EM64T) and P6 saved its skin and allowed it to ultimately restore its leadership and monopoly.

On the other hand, AMD enjoyed immense success by focusing on the right things and following the sound design principles embodied in K7 and K8. It seemed as if that success was going to break Intel's monopoly for good; anticipating that, I invested my money aggressively in that thesis, and lost acutely.

Now, again, why did I lose?
I explained that I always understood what a great challenge the single-die quadcore was (the triple challenge of "Does AMD..."), and that I naïvely thought AMD's management knew what they were doing; I couldn't conceive that AMD's management could be so irresponsible as to gamble the company on this huge undertaking without guarantees of success. Then it became clear that they didn't have any guarantees, as a matter of fact, no reason whatsoever to suppose the single-die quadcore/new process/new architecture challenge was going to be successful; but it was too late already, two thirds of my money had gone.

Did I have any chance of being right, though?
Not really. I should have followed my convictions to their conclusions. Not only did the single-die quadcore represent a huge challenge, it was a totally unnecessary one too, and again, I thought that AMD's management knew better than me:

I knew in 2005 that the market would not require full quadcore performance as early as 2007, not even 2009; the reason is very easy to grasp for a seasoned software engineer: the combinatorial complexity of multithreaded software. Software that makes effective use of two threads is more than twice as hard to write as single-threaded software; software that makes effective use of four threads is more than ten times harder than two threads, no exaggeration. I knew that the cores would be severely underutilized in the vast majority of applications, in the vast majority of the market (see the sketch below). Since the market wouldn't need full quadcore performance, the logical conclusion was to bet on practical approaches to the quadcore issue, that is, multi-chip processors: either double-duals, like Intel's, or four cores in a die with a large shared cache on an adjoining die. I also knew that the K8 architecture, as it was, had bandwidth aplenty and short enough latencies for quadcore processors in a Master/Slave configuration of dual-core dice. The single-die quadcore approach wasn't called for at all, and that alone should have been more than enough to erode my confidence in AMD's plans.
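A minimal sketch of the underutilization argument, via Amdahl's law (my illustration; the parallel fractions below are assumptions about typical desktop software of the time, not measurements):

    def amdahl_speedup(parallel_fraction, cores):
        # Amdahl's law: the serial part of a workload caps the speedup.
        serial = 1.0 - parallel_fraction
        return 1.0 / (serial + parallel_fraction / cores)

    for p in (0.30, 0.50, 0.70):
        s2 = amdahl_speedup(p, 2)
        s4 = amdahl_speedup(p, 4)
        print(f"parallel fraction {p:.0%}: 2 cores -> {s2:.2f}x, 4 cores -> {s4:.2f}x")

    # parallel fraction 30%: 2 cores -> 1.18x, 4 cores -> 1.29x
    # parallel fraction 50%: 2 cores -> 1.33x, 4 cores -> 1.60x
    # parallel fraction 70%: 2 cores -> 1.54x, 4 cores -> 2.11x

Unless most of the workload parallelizes, the third and fourth cores buy very little: exactly the underutilization I expected.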

But that is not all. Although very late, I saw that AMD was in denial about the threat of Conroe. Even though I heard Henri Richard speak with deference about Conroe, suggesting that AMD's management had somehow internalized that the product was very threatening, that never translated into any change of plans. AMD went ahead with all sorts of plans that assumed massive market share gains: negotiating the New York fab, consolidating flex capacity through Chartered, and acquiring ATI (which wouldn't have made sense if AMD could not compensate lost ATI Intel-based business with extra AMD-based business). So the company greatly overstretched its resources anticipating a great demand that never came; but worse than that was the absence of a contingency plan. AMD's management never cared to tell us, the people who cared about and supported the company, what it was going to do if Conroe turned out as good as it seemed; management only spoke in dismissive terms about the threat. The improvement-less AM2 should have been another late warning.

While it is still a mystery to me why the company went the perilous single-die quadcore route for no good reason, we can all be sure that the inflexibility of AMD's plans greatly contributed to the catastrophe. And this inflexibility is inherent to the scarcity of resources to try experiments, business experiments, to explore the business space. Once AMD chose a strategy, there was nothing left to do but pray and work as hard as possible to make it succeed; mid-term revisions were pointless. If you know that AMD is on a "projectile" trajectory, that once it fires everything afterwards is inertial flight, as I knew it was in April 2006, it is possible to determine whether it has reached its apex or not -- it had, because the improvements to the K8 had become smaller and smaller until non-existent. Today, you may readily appreciate that AMD is in free fall, because it fired the K10 and the shot fizzled; from here on, it is nothing but accelerating decline.

I have referred several times to the discrepancy between my opinions derived from fundamentals and the strategies decided by the management of both Intel and AMD. You may be wondering how to determine which is the superior analysis, because that is key for successful speculation, and here is how: side with the thesis of the better informed.

If you have enough knowledge to reach a conclusion from pure fundamentals, such as "Itanium is not possible to implement satisfactorily" or "single-die quadcores are not necessary", then, in the absence of concrete evidence that contradicts those conclusions, follow your instincts. I find it fascinating that a pedestrian may understand certain key issues of multi-billion-dollar companies better than their managers, but it happens all the time. The reason is that there is a disconnect between management and reality. While from the point of view of a software engineer you are seeing the real stuff, the real value of an architecture, the CEO, despite all of his privileged information, is not seeing the real thing, but third- or fourth-level interpretations (the opinion on the opinion on the opinion of someone about something), and there may be interests distorting his information.

On the other hand, if it is something like "I think the P6 architecture is exhausted, that it is impossible to substantially increase its performance", but at the same time a huge organization like Intel does a 180-degree turn to commit to a P6-derivative architecture developed by a group of people outside the power rings of the organization, you can be sure that their architecture is so fucking good that its objective merits prevailed over the intense politicking of a big organization; that is, your opinion is wrong.

A closer example: when a national investment company throws $600M at AMD and you think the company is doomed, who is right, the Arabs or you? Answer: think about "what do the Arabs know about the semiconductor industry?", "are they in the centers of power that decide stock prices and media spin?", and also "has this organization explored the semiconductor business?". If you conclude that it is not likely that such an organization has privileged information, you may be right. It would be even better if you also had a model to explain why the organization put the money in AMD.

Sometimes it is a much tougher call. For instance, "will Microsoft succeed at DRM?", by which I mean whether Microsoft will succeed at becoming the de facto multimedia distribution channel by temporarily jeopardizing the Operating System monopoly with DRM restrictions that alienate the customers. Microsoft is a company that routinely fails at home-grown initiatives, despite following many different routes and covering all the bases. This is evidence that although Microsoft thoroughly explores its business space, its "business cartography" doesn't make its way to management all the time. Why would that happen? If you watch the "developers" video of Steven Ballmer, you get a hint. And Bill Gates's quotations about the internet are a famous example of business myopia -- this could be a problem of management personalities that do not allow information that contradicts them to flow. Looking closer at Microsoft's successes, such as Internet Explorer in the browser wars, you may see that sometimes, when Microsoft is actually threatened, it does the right things; perhaps only when management is willing to admit being wrong can the company capitalize on its thorough knowledge of its business. The problem is that from the outside it is nearly impossible to know whether the company is using the business cartography or not.

Apple Computer is another example: I may be of the opinion that Mac OS X should be licensed to run on non-Apple personal computers, but I must be wrong: Steven Jobs surely has much better information about this topic. Mr. Jobs has demonstrated that he learned the lessons of his long penitence after being fired from Apple and losing an arm and a leg on NeXT, and Apple continually demonstrates how well it understands its business space. Thus, I have to keep studying Apple to improve my model of that company.

Going back to AMD and its management, it is clear that those guys don't have a clue about their business space. Once they got out of their traditional "value proposition" space, they made blunder after blunder:

  1. They never capitalized on the enormous significance of AMD64 by continuing to partner with Linux to force Microsoft into promoting AMD64 for all segments, beginning with mobile [ what would it have taken to finance the development of Linux drivers for the few wireless chipsets that ran on AMD-based laptops? AMD could have become the "Centrino" of Linux, and this would have induced multiple effects that would have turned AMD64 into premiums ]; this because AMD never knew how to behave like an industry leader when it had the undisputed leadership.
  2. They tried to create the consumer market for coprocessors only too late and too timidly, despite having had all the elements for coprocessing in place for years upon years.
  3. They didn't capitalize on the desperate need that platform partners had to support AMD (because Intel is treacherous and tries all sorts of nasty tricks to block competition from getting into its platform business). Contrary to all rationality, they went ahead and became the direct competitor of their second-best partner (nVidia) and destroyed half the business of their fourth-best partner, ATI. I think AMD's management was simply not ready for the big leagues here; they were scammed by Wall Street into a disastrous acquisition the way hustlers scam farm boys when they go to the big city.
  4. They showed that they were ignorant of how to deal with big OEMs: in periods of high demand for their products, rather than making lots of profits, the preference given to the OEMs translated into the destruction of the loyal, historical, natural channel business, and into losses! Then they fell into the trap of overproduction, instigated by the OEMs to get cheaper products.
  5. They gambled the company on the "triple challenge".
Business space exploration has a cost and an expected benefit. The cost does not increase with company size, but the benefit gets multiplied by company size: exploration is a much better option for large companies than for small ones, another form of economies of scale (see the toy sketch below). For a company as small as AMD, for which exploring the market with several architectures is so expensive, business space exploration is not cost-effective. But this doesn't mean that small companies don't have a chance: on the contrary, they should have the advantage of having their management closer to reality, that is, a better chance of succeeding by following best principles. When you see a small company like AMD straying from best principles (such as going the route of the triple challenge), "short" the stock; statistically you will succeed.
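A toy expected-value sketch of why exploration favors the big player. All numbers are invented for illustration; only the revenues are in the rough ballpark of the two companies' 2007 figures:

    exploration_cost = 1.0e9   # funding a parallel "plan B" (assumed)
    p_success = 0.25           # chance the plan B pays off (assumed)
    uplift = 0.15              # revenue uplift if it does (assumed)

    for company, revenue in (("AMD-sized", 6.0e9), ("Intel-sized", 38.0e9)):
        benefit = p_success * uplift * revenue
        verdict = "worth it" if benefit > exploration_cost else "not worth it"
        print(f"{company}: expected benefit ${benefit / 1e9:.2f}B "
              f"vs cost ${exploration_cost / 1e9:.2f}B -> {verdict}")

    # AMD-sized:   ~$0.2B benefit vs $1B cost -> not worth it
    # Intel-sized: ~$1.4B benefit vs $1B cost -> worth it

The same bet, at the same odds, is rational for one company and ruinous for the other.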

P6 Line:
Pentium Pro, Pentium II - Klamath, Deschutes, Tonga, Dixon, Xeon Drake, Pentium III - Katmai, Coppermine, Tualatin, Xeon - Tanner, Cascades, Pentium M - Banias, Dothan, Yonah/Xeon Sossaman, 'Core' - Conroe, Allendale, Kentsfield, Woodcrest, Clovertown, Tigerton, Harpertown, Merom, Penryn, Yorkfield

Monday, November 19, 2007

Phenom

I said that Barcelona (and Phenom) as a strategy implied absolute confidence to succeed at three important challenges:

  1. An extraordinarily good 65nm silicon process, to try to close the competitiveness gap with Intel
  2. A new architecture developed without the help of a thoroughly understood process
  3. The single-die quadcore feature, which implies a hit to yields and especially binsplits
Since it is now very clear that AMD failed at all three, it is easy to find articles in the press detailing the situation, and I would like to compile some of them.

Adrian Kingsley-Hughes writes "AMD's Phenom - For suckers only". There we find a reference to "Tom's Hardware" demonstrating that the 6400+ X2 (3.2 GHz, 90nm [you may go here and select the 6400+ model to confirm the specifications]) beats the fastest Phenom, the 9600 at 2.3 GHz, in a variety of tests. That is, even today, a year after the launch of 65nm product, it is the 90nm products that hold AMD's performance crown. No wonder the highest-priced processors [AMD's official price list], both Athlon and Opteron, are all 90nm. Not only are the 65nm processors slow, and sold on the cheap, but inexplicably, the 65nm K8 is an involution compared to the 90nm, even though the company has five and more years of experience with the K8 design. The 65nm parts are not slightly better, but slightly worse: AMD cut corners, such as increasing the L2 cache latencies.

Rumors have surfaced regarding bugs in the K10 architecture that have delayed its introduction [example]. AMD has had to develop the silicon process at the same time it tries to debug the new architecture; that's the recipe for low yields, unacceptably slow speeds, and late introduction. There are good improvements in the K10 architecture, but I think the existence of a third cache level could be the culprit of K10's lukewarm performance: the K8 had a two-level cache architecture that, due to the exclusivity with the L1, had delays resembling 1.5 levels; but now, rather than having a large shared L2 cache, as the successful Intel approach demonstrates, the K10 has three levels. Why? My speculation is that a quadcore design requires a large shared cache, and AMD didn't dare to try this radical change at the L2, so it came up with a third level that offers dubious performance advantages. Do you see? A shared cache is more natural in a single-die design; yet AMD insisted on less efficient independent L2 caches, which forced it to introduce a shared third level, creating an inefficient three-level hybrid.
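To make the three-level penalty concrete, here is a toy average-memory-access-time comparison. Every hit rate and latency below is invented for illustration (not measured K8/K10/Core numbers); the two configurations are deliberately given equal total cache coverage, so the only difference is that hits which a large shared L2 would catch instead pay L3-class latency:

    def amat(levels, memory_latency):
        # levels: (hit_latency_cycles, hit_rate) pairs from L1 outward;
        # hit_rate is conditional on the request reaching that level.
        time, reach = 0.0, 1.0
        for hit_latency, hit_rate in levels:
            time += reach * hit_rate * hit_latency
            reach *= 1.0 - hit_rate
        return time + reach * memory_latency

    big_shared_l2 = [(3, 0.90), (14, 0.85)]                 # two levels, large L2
    small_l2_plus_l3 = [(3, 0.90), (12, 0.70), (40, 0.50)]  # three levels, added L3

    print(f"two levels:   {amat(big_shared_l2, 200):.2f} cycles")     # 6.89
    print(f"three levels: {amat(small_l2_plus_l3, 200):.2f} cycles")  # 7.14

Both configurations miss to memory 1.5% of the time, yet the three-level hybrid is slower on average, even with a faster L2: that is the "dubious performance advantage" in a nutshell.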

We also know that AMD will launch triple-cores, which speaks volumes about its dramatic failure at the challenge of single-die quadcores. Further illustrating the principles that dictate exponentially worse yields and binsplits in single-die multicores, AMD is launching an overclocking application that lets the user control the speed of each core individually.

Yet AMD didn't need to follow this route at all. It could have developed 65nm not with the intention of increasing volumes, but premiums. AMD speaks of "Asset Light" today because it has plenty of production capacity for the low-performance products with which the market is saturated, but doesn't have any high-premium product; perhaps because AMD saw the 65nm process as a tool to minimize production costs and increase production volumes rather than as the process for its new generation of flagship products. The emphasis on volume and cheap production costs rather than performance while developing the 65nm process may have its origins in AMD's "pharaonic" plans of having to supply all the OEMs with plenty of product, especially the production-capacity increase that Dell meant. Perhaps performance had to wait. But not even that strategy worked: we know what happened when AMD, a year ago, gave preference to Dell and other OEMs over the channel: the destruction of AMD's loyal and natural businesses.

AMD didn't need a new architecture to market quadcores; it had several MCM options at its disposal. It is clear today that the excellent interprocessor bandwidth of DCA and CCHT would have been more than enough for processors made of two dual-core dice in which only one has the interface to memory and the other communicates through CCHT. Actually, AMD had every advantage to put quadcores on the market before Intel; but of course, the company chose the strategy of skipping all intermediate steps for the big leap of the single-die quadcore, and this catastrophe happened.

Also, the company could have opted for a new architecture that actually increased performance. The most pathetic thing of all is that the K10 quadcores, even with the large advantages of single-die integration and point-to-point (busless) interprocessor communication, are actually SLOWER per clock than the Intel double-dual kludge. When you try not to improve performance, but merely to solve the problems that derive from single-dieness, you may end up with an architecture that is good for nothing.

Adrian Kingsley-Hughes again: "So really, what’s the bottom line? AMD have put time, effort and truck loads of dollars into developing a quad-core processor that really isn’t that good".

Now, let's look at Phenom versus Intel products:
Anandtech (referenced by Kingsley-Hughes): "While AMD just introduced its first 2.2GHz and 2.3GHz quad-core CPUs today, Intel previewed its first 3.2GHz quad-core chips", "Intel is currently faster and offers better overall price-performance (does anyone else feel weird reading that?). Honestly the only reason we can see to purchase a Phenom is if you currently own a Socket-AM2 motherboard".

Tarinder Sandhu @ HEXUS.net (also referenced by Kingsley-Hughes): "We've disseminated all the various enhancements that make Phenom a better clock-for-clock proposition than Athlon 64 X2. We've identified that the design is elegant[.] But what we've also seen is that AMD cannot match the clock-speed of Intel's slowest quad-core processor and, worse still, can't match Core2 Quad's performance on a clock-for-clock basis either." and "the Spider platform - where AMD tries to harness the innate synergies of its processors, chipsets and GPUs - can be bettered by a mix-and-match assortment of Intel and NVIDIA" -- so much for the "synergies" of the ATI acquisition; over a year later, there is unanimous agreement that Intel+nVidia is superior.

But the real problem is not that Phenom (and Barcelona) stinks: Intel is busy finishing a new busless platform and architecture. AMD demonstrated that most of DCA and CCHT is overkill for the vast majority of computers; that is, Intel doesn't have to do something as fancy as DCA and CCHT, but merely something good enough to get rid of the most important limitation of its architecture: the front side bus. Also, from the "pure muscle" point of view, Intel is already shipping processors with metal gates and high-k dielectrics (that is, much more efficient and faster) while AMD is still trying to catch up to its own 90nm! So the competitive position of AMD will become worse, much worse.

Jason Cross @ Extremetech (also ref. by Kingsley-Hughes) mentions this in passing: "As Yorkfield CPUs mature and come down in price, the only way for these Phenom chips to compete will be to offer much better clock speeds without blowing out the power envelope, and they won't get there on 65nm. AMD simply can't afford to let Intel's Nehalem redesign hit the market, mature, and come down in price before bringing out a wide range of 45nm Phenom CPUs."

I think that nobody should be talking about AMD's 45nm products while the company is still repeatedly flunking the 65nm grade.

Anyway, regarding AMD's future, things are turning hopeless: it can't get rid of the fabs it owns, due to either x86 licensing restrictions or contractual obligations with Dresden; thus, no "Asset Light", but expensive fabs that will be less and less competitive with what Intel is doing. nVidia keeps pulling ahead relentlessly in the graphics business while ATI is still shedding Intel-based market share, with no synergies in sight. AMD can only move product on price, but it is already enduring nightmarish losses; all while Intel comes with vastly more competitive products and AMD has already used up its best ammunition...