Early Access to the Intel® Itanium® Processor Keeps CoSort® on the Cutting Edge
Introduction
A move to developing for Itanium®-based systems and Intel® Extended Memory 64 Technology (Intel® EM64T) can liberate a programmer. 64-bit computing is a great technical fix for many of the design challenges that face typical applications, particularly those having to do with memory limits. As an added benefit, most software that is code-clean for the Intel® Itanium® Processor Family also runs smoothly on systems that support Intel EM64T with just a recompile.
There's a cost to those benefits, of course. At the very least, you need a clear understanding of the "memory model" implicit in application development, and of the consequences of the move from 32 to 64 bits. To take proper advantage of the new platforms, you'll also have to deal with the same old "housekeeping" issues that have been around for years. Building 64-bit software so it is as zippy as it should be is like any other coding project: you still need to manage memory carefully.
It may come as a surprise to many developers that sensitivity to low-level memory housekeeping remains crucial for effective programming in the 64-bit environment, but it is nevertheless true.
What We Expect from 64-Bit Computing
Computing on Itanium-based systems and Intel EM64T promises to eliminate immediately several of the bottlenecks that now frustrate application developers, including the following:
Paging overhead
File-size limits
Input/Output (I/O) constraints
Access to a 64-bit address space opens a vastly greater range of in-memory implementations. Programs have a far greater opportunity to manage large datasets as in-memory logical objects, rather than as on-disk images that must be swapped in and out. This capability improves both performance and availability.
32-bit computing constrains file sizes to no more than two gigabytes. That limits many applications in active use today, especially those in the areas of database management, video processing, application service and a range of enterprise software. 64-bit operating systems, however, manage file systems with files larger than most other hardware can yet support.
64-bit platforms are ready to overcome those limitations, today. When programmers move to 64-bit computing, they leave behind whole categories of memory-related problems. That liberation, however, does not eliminate the need for developers to attend to memory issues. There are two reasons developers still must think carefully about memory. The first has to do with the basics of programming memory models.
Memory Basics Still Matter
Memory is never larger than the capacity to mismanage it. While 64-bit computing pushes back memory limits, it still pays to respect memory-management guidelines. The languages in which nearly all applications are developed share several aspects of their "memory model" or semantics. They all have a notion of "variable". Variables label elements of data storage. Memory access imposes just a handful of abstract rules:
When you assign a value to data storage, the storage element must be proper.
When you retrieve a value from data storage, the storage element must be proper.
Your language has at least one way to request new data storage elements, but also a limit on their total capacity. You need to be careful not to exceed that limit.
Consider first how a language like C embodies these abstract rules in concrete coding patterns. C defines named variables, and also allows direct access to memory. The main difficulty with C's named variables with regard to memory access has to do with array types; programmers frequently make such mistakes as the following:
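(The fragment below is a representative sketch; only the assigned value is illustrative, while the array name, its ten-element size and the index 14 follow the discussion that follows.)

    int main(void)
    {
        int my_array[10];       /* defines my_array[0] through my_array[9] */

        my_array[14] = 3;       /* writes to an element that was never defined */
        return 0;
    }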
The error is that '14' names a memory element that hasn't been defined; the definition of my_array included only ten elements. This is generally called an array-bounds error; the source code expresses a nonsensical request to write a value to a location that doesn't exist.
C's ability to access memory directly permits such mistakes as the following:
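(Again a representative sketch; the pointer name matches the discussion below, and the assigned value is illustrative.)

    int main(void)
    {
        int *my_pointer;        /* declared, but never pointed at valid storage */

        *my_pointer = 3;        /* writes through an uninitialized pointer */
        return 0;
    }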
The foregoing is an error because my_pointer is uninitialized; in particular, there's no guarantee that it points to a "safe" data element area.
There are complementary restrictions and violations for memory retrieval in C. For instance, the following code is a mistake.
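(A representative sketch, assuming a retrieval through an uninitialized pointer; the variable names are illustrative.)

    int main(void)
    {
        int *my_pointer;        /* never initialized */
        int value;

        value = *my_pointer;    /* retrieves a value from storage that was never made proper */
        return value;
    }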
C has several distinct ways to request data storage elements. The one most likely to lead to problems has to do with the malloc() memory-management Application Programming Interface (API). A common model for malloc usage appears below. While this is a common construction, it is also in error: it does not account for the possibility that malloc may have 'run out of memory', in which case it returns NULL, and assignment through a NULL pointer leads to undefined results.
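(A representative sketch of that model; the buffer size and the data copied into it are illustrative.)

    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *buffer = malloc(100);

        strcpy(buffer, "payload");   /* no check that malloc succeeded; buffer may be NULL */
        free(buffer);
        return 0;
    }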
These basics aren't specific to 64-bit computing. Doesn't 64-bit computing mean it's time to move on to more important concepts?
Security Is Paramount
Memory management is the important concept that demands our attention. While the simple material above, or its translations into other languages, looks familiar to essentially all programmers, our profession still doesn't understand it deeply. More precisely, massive evidence exists that the source code of existing applications densely embeds instances of these simple errors.
Organizations use 64-bit systems to perform important calculations: mission-critical database transactions, demanding graphics computations, and so on. These investments are substantial, and it is worth a bit of engineering time and a few infrastructure tools to assure that the results of those calculations are accurate.
Security implications only reinforce all these conclusions. Memory-coding errors have been known for at least twenty years to lead directly to security vulnerabilities. Despite this fact, and despite the apparent conceptual simplicity of memory management, memory-access miscodings appear to be responsible for the majority of breaches that viruses and related security attacks exploit. The security status of software as a whole is disastrous. Moreover, for better or worse, deep security analysis is only possible once basic memory access is sanitized. Any memory-access violations can reduce the effectiveness of other security measures all the way to zero.
As a profession, we still have too many of these uncorrected memory mistakes in our applications. The good news is that it's possible to get memory access right. It's not easy, or we would have done it already. With discipline, though, it is manageable.
Categories of Correction
There are three means to get memory access right:
'Safer' languages and safer libraries
Inspection
Special-purpose memory-testing tools
Java* is a safer language than C, in particular in terms of memory management. Java has more restricted semantics than C; it's more difficult to 'shoot yourself in the foot' with Java, because it simply doesn't express such mistakes as the uninitialized memory pointer above. At the same time, it is important to understand that these distinctions are relative. Many C-coded programs "leak memory" because of code fragments such as this:
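(The fragment below is a representative sketch; the function name and allocation size are illustrative.)

    #include <stdlib.h>

    void handle_request(void)
    {
        char *workspace = malloc(4096);

        if (workspace == NULL)
            return;
        /* ... use workspace ... */
        return;                      /* workspace is never freed: each call leaks 4,096 bytes */
    }

    int main(void)
    {
        for (;;)
            handle_request();        /* the process's memory footprint grows without bound */
    }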
Java isn't subject to this kind of memory leak. However, Java-coded programs still can leak memory at the application level. Think, for example, of a Web server coded in Java. It has a cache mechanism. If there's no provision for rotating or bounding the cache, it will eventually fill all of the physical memory. While the memory might not have "leaked" in a formal sense, an inadequate design leads to the unsustainable result that the server can make no new computations. This highlights the reality that all programmers, not just those working in C or C++, need to be sensitive to memory models and their proper engineering.
Many organizations that put a premium on performance wrongly assume they should use low-level languages for 64-bit applications. As previous articles in this series have emphasized, it's far more rewarding to use high-level languages, achieve correct and reliable results more quickly, and invest in hardware and algorithm development. Reliance on low-level languages for large-scale development with diffuse performance goals simply does not deliver the application-scale speed widely believed to be typical of "fast" languages.
On the other hand, problems with memory access, and especially memory security, are quite typical of code built with low-level languages. Use high-level languages. Use domain-specific languages. If your applications fail to meet performance requirements, address that as a specific, local challenge, with the help of experts who specialize in performance achievements. (Membership in the Intel Software Partner Program provides information and assistance optimizing and porting applications for Intel's latest processors.)
If you are using a language such as C or C++, you can prevent many common memory problems by substituting higher-level libraries for more primitive memory approaches. 'Garbage-collected C' eliminates one entire sector of memory errors. Use of the Standard Template Library (STL) or other powerful C++ class libraries reduces source-code line count and therefore the opportunities for mistakes.
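One illustration is a conservative garbage collector such as the Boehm collector, which removes the need to pair every allocation with an explicit free. A minimal sketch, assuming the collector's conventional gc.h interface (the allocation size and loop count are illustrative):

    #include <gc.h>                              /* the Boehm conservative collector */
    #include <stdio.h>

    int main(void)
    {
        GC_INIT();

        for (int i = 0; i < 1000000; i++) {
            char *workspace = GC_MALLOC(4096);   /* no matching free is required */
            workspace[0] = 'x';                  /* unreachable blocks are reclaimed automatically */
        }
        printf("done\n");
        return 0;
    }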
Perhaps the single most effective way to improve the cleanliness of memory coding is inspection. I use the word here in a broad sense, including everything from careful bench testing during early design stages, through the introspective collaboration Extreme Programming counsels for coding, to source-code engineering review. It's not just that academic research has demonstrated that inspection is less labor-intensive than discovery of problems in late-stage quality assurance or customer response. Inspection simultaneously has the potential to give programmers practical experience in improved techniques. It's an important investment in an employer's intellectual capital.
Meaty published literature on software inspections is available through Brad Appleton's 'jump page'*. If you must restrict yourself to one item, let it be the book titled Software Inspection*.
Testing Tools
Specialized memory-testing tools are also quite effective at locating faults. In fact, my experience across the range of these tools is that they're exceptionally rewarding. Every organization with which I've worked to install an automated memory-testing tool has located programming faults it found through no other method. This is noteworthy, if only because I've seen so much software bought that was never even installed successfully, let alone executed to a satisfying outcome.
Why list tools last, then, after language choice and inspection? Because the selection of memory testing tools for 64-bit platforms is still relatively limited. This is not particularly surprising. While there don't appear to be any dramatic technical obstacles to porting these memory-testing products to 64-bit platforms, release of these tools may have to wait until the market for them is larger.
Developers working on 64-bit systems need the help of specific memory-testing tools. Memory tests are different from validation of other technical requirements. Even workers with extensive general testing experience need to study the specifics of memory-management work before they can be productive in this specialized area.
Even with the limited number of memory tools available for 64-bit computing, the developer has two alternatives. The first is to perform memory tests on a platform the tools already support, such as Windows NT or Unix. The other is to use open-source memory-testing tools on 64-bit systems.
There are several valuable open-source testing tools, generally available through Ben Zorn's* page on 'Debugging Tools for Dynamic Storage Allocation and Memory Management'. While this page is rarely updated, it's adequate as an introduction to the available products. Several open-source products, including mpatrol* and Electric Fence*, install relatively easily and are effective at tracking down such problems in a C-coded context as the following (a compact example appears after the list):
Out-of-bounds array errors
Reading from uninitialized or unavailable 'heap' memory
Writing to unallocated memory
Leaks of memory as the addresses of storage elements are mistakenly discarded
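A compact, illustrative C program that exhibits several of these categories, and that tools of this kind can be expected to flag (the sizes and values are arbitrary):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int *heap_array = malloc(10 * sizeof(int));

        if (heap_array == NULL)
            return 1;
        heap_array[10] = 7;                  /* out-of-bounds write, one element past the end */
        printf("%d\n", heap_array[3]);       /* read of uninitialized heap memory */
        return 0;                            /* heap_array is never freed: a leak */
    }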
Several other languages also support memory extensions or commands that help track down the memory faults characteristic of those languages. Tcl, for example, has a [memory] command to help debug memory allocation problems. Keep in mind that memory management isn't a 'sexy' topic in most programming circles. To learn more about the memory capabilities of these products, it's generally necessary to consult with specialists in the languages that interest you. Few 'advocacy' or 'language profile' pages provide specific technical details on memory hygiene.
The Importance of Porting
If you don't yet have the habit of verifying that your programs handle memory properly, the move to 64-bit computing makes it urgent that you learn it. There are several ways to start, involving everything from personal study to the purchase of professional-quality products and training; at least one of these should fit both you and the organization for which you work.
The security (or insecurity) of computer systems has recently become important enough to be a featured topic in the popular press. In many, perhaps most, cases of security faults known to the public, the specific technical fault turns out to be a form of memory mismanagement. Programmers have had a rather sorry record in the past of delivering applications that leak and overwrite memory, often with grave security consequences. You can do better. We might soon enter an era where we're required to do better, at least with regard to well-known memory issues.
Attention to memory issues arises in a second way in the 64-bit world. A naive port to a 64-bit platform simply carries source code over, re-compiles and re-links it as necessary, and uses the executables that are verbatim translations of their 32-bit counterparts. This should, in general, give correct functionality. It fails to take full advantage of the target architecture, however, and in some cases such a port performs more slowly than its 32-bit relative.
What should you do when porting work to 64-bit systems? Start with the resources in the guide to "Porting to the Itanium Processor Family". Several of these documents deserve your attention; first among these is likely to be "IA-64 Porting Methodology". You can anticipate much of the advice there: use correct source, keep your code base clean, be precise in data declaration and Application Programming Interface (API) invocation, and use a hardware system that's been properly built. Analyze your requirements carefully; there's little advantage to tuning memory layout if your application is I/O-bound.
Programs that work properly on 32-bit machines might fail entirely when ported naively to Itanium processors or Intel EM64T. This often perplexes programmers whose experience is limited to a single architecture. Improper source code might yield acceptable results "by accident" on one processor, but go dramatically wrong when the memory model changes even slightly. Don't treat this as a mystery. It's best to rewrite source code to use proper idioms. Clean up your source before you start any hunt for incompatibilities between platforms. Problems are far likelier to result from deprecated coding constructs than true incompatibilities.
In C terms, keeping your source code clean starts with disciplined use of lint and related tools for static analysis. Be sure you're consistent with the semantics of your variables – casts between pointer and numeric types are usually a warning sign of confusion and even error.
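A sketch of the kind of cast such tools flag, assuming the common LP64 model in which pointers are 64 bits while int remains 32 (the variable names are illustrative):

    #include <stdio.h>

    int main(void)
    {
        char  buffer[16];
        char *original  = buffer;
        int   narrowed  = (int)original;         /* on LP64 platforms, the upper 32 bits are lost */
        char *recovered = (char *)narrowed;      /* recovered may no longer equal original */

        printf("%p %p\n", (void *)original, (void *)recovered);
        return 0;
    }

If a pointer genuinely must round-trip through an integer, intptr_t from <stdint.h> is the appropriate type.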
Even when your code is correct, a move to an Itanium-based system or Intel EM64T can 'de-tune' it. Data and code segments expand, because they're based on 64 rather than 32 bits. If you've carefully optimized your memory layout, you might find that your segments no longer fit.
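As an illustration of that expansion, assuming a typical LP64 data model in which pointers and long grow to 8 bytes while int stays at 4 (the structure itself is hypothetical):

    #include <stdio.h>

    struct record {
        int   id;      /* 4 bytes on both 32- and 64-bit platforms               */
        char *name;    /* 4 bytes under ILP32; 8 bytes, plus padding, under LP64 */
        long  count;   /* 4 bytes under ILP32; 8 bytes under LP64                */
    };

    int main(void)
    {
        /* Typically 12 bytes under ILP32 and 24 bytes under LP64. */
        printf("sizeof(struct record) = %zu\n", sizeof(struct record));
        return 0;
    }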
Conclusion
Moving on from existing 32-bit code, new development should be even easier to code right for an Itanium-based system and Intel EM64T. If you continue to write in low-level languages, the beginning of a project is generally the easiest time to get datatyping and architectural issues right. The beginning is also a good time to consider new languages, or at least new libraries. It's not just that these can ease your own programming, as the sections above explained for the case of memory issues.
Compilers and other language processors generally attract top talent, so, by writing in a high-level language, you're leveraging their intelligence. A great deal of direct and indirect effort has already gone into making such languages as Java and Perl* efficient with the new processor. When you code in Java or Perl, you know that your end-users will run applications already optimized for Itanium processors at one implementation level. The Performance Optimization Center for the Itanium Processor Family presents the main issues involved in programming new projects for Itanium processors.
Whatever your programming responsibilities, keep the basics in mind: be clear about your requirements, keep designs and implementation clean, put a specific plan in place to ensure correct memory use, take advantage of programming tools that understand 64-bit computing, and keep reading the Intel® Developer Zone for tips on expert development.
Additional Resources
Itanium® Processor Family Performance Advantages: Register Stack Architecture follows a race-track analogy to guide developers through taking advantage of the register stack engine to make programs run faster.
Itanium® Processor Family Developer Center provides resources to help developers derive maximum benefit from the platform's architecture and scalability.
Data Alignment when Migrating to 64-Bit Intel® Architecture - Proper data alignment is important on both the Itanium® Processor Family and on processors that support Intel Extended Memory 64 Technology. Learn to avoid exceptions and costly performance deficits associated with incorrect data alignment.
Porting to 64-Bit Intel® Architecture provides a high-level examination of porting applications to Intel EM64T, including capabilities of the technology, as well as related caveats and limitations.
Intel® Itanium® processor home page covers the latest features, specifications and performance figures for the Itanium processor.