13 Replies Latest reply on Jul 5, 2017 10:18 AM by mr.unknown

    Wikipedia Assembly Language Programming

      I reorganised the table here in the hope that it clears up confusion for beginners who are starting to learn assembly language programming and computer organisation. I combined the two tables into one in order to reflect the relationship between x86 and x86-64 processors and the complexity involved. I hope people who are really interested in the topic will offer constructive suggestions. Thank you in advance!


      Message was edited by: Matt B We have updated the title of this discussion with relevant details to better describe your issue.

        • Re: A table needs evaluating
          | Generation | Introduction | Prominent CPU models | Address Space | Notable features |
          | 1st | 1978 | Intel 8086, Intel 8088 (1979) | 16-bit linear, NA virtual, 20-bit physical | 16-bit ISA, IBM PC (8088), IBM PC/XT (8088) |
          |  | 1982 | Intel 80186, Intel 80188, NEC V20 |  | 8086-2 ISA, embedded (80186/80188) |
          | 2nd |  | Intel 80286 and clones | 30-bit virtual, 24-bit physical | protected mode, IBM PC XT 286, IBM PC AT |
          | 3rd (IA-32) | 1985 | Intel 80386, AMD Am386 (1991) | 32-bit linear, 46-bit virtual, 32-bit physical | 32-bit ISA, paging, IBM PS/2 |
          | 4th (pipelining, cache) | 1989 | Intel 80486, AMD Am486 (1993)/Am5x86 (1995) |  | pipelining, on-die x87 FPU (486DX), on-die cache |
          |  | 1993 | Intel Pentium, Pentium MMX (1996) |  | Superscalar, 64-bit databus, faster FPU, MMX (Pentium MMX), APIC, SMP |
          |  | 1994 | NexGen Nx586, AMD 5k86/K5 (1996) |  | Discrete microarchitecture (µ-op translation) |
          |  | 1995 | Cyrix Cx5x86, Cyrix 6x86 |  | dynamic execution |
          | (PAE, µ-op translation) | 1995 | Intel Pentium Pro | 36-bit (PAE) | µ-op translation, conditional move instructions, dynamic execution, speculative execution, 3-way x86 superscalar, superscalar FPU, PAE, on-chip L2 cache |
          |  | 1997 | Intel Pentium II, Pentium III (1999), Celeron (1998), Xeon (1998) |  | on-package (Pentium II) or on-die (Celeron) L2 cache, SSE (Pentium III), Slot 1, Socket 370 or Slot 2 (Xeon) |
          |  | 1997 | AMD K6/K6-2 (1998)/K6-III (1999) | 32-bit | 3DNow!, 3-level cache system (K6-III) |
          | Enhanced Platform | 1999 | AMD Athlon, Athlon XP/MP (2001), Duron (2000), Sempron (2004) | 36-bit | MMX+, 3DNow!+, double-pumped bus, Slot A or Socket A |
          |  | 2000 | Transmeta Crusoe | 32-bit | CMS powered x86 platform processor, VLIW-128 core, on-die memory controller, on-die PCI bridge logic |
          |  |  | Intel Pentium 4 | 36-bit | SSE2, HTT (Northwood), NetBurst, quad-pumped bus, Trace Cache, Socket 478 |
          |  | 2003 | Intel Pentium M, Intel Core (2006) |  | µ-op fusion, XD bit (Dothan) |
          |  |  | Transmeta Efficeon |  | CMS 6.0.4, VLIW-256, NX bit, HT |
          | 64-bit Transition (1999 ~ 2005) | 2001 | Intel Itanium (2001 ~ 2017) | 52-bit | 64-bit EPIC architecture, 128-bit VLIW instruction bundle, on-die hardware IA-32 H/W enabling x86 OSes & x86 applications (early generations), software IA-32 EL enabling x86 applications (Itanium 2), Itanium register files are remapped to x86 registers |
          | 64-bit Extended (since 2003) | 2003 | Athlon 64/FX/X2 (2005), Opteron, Turion 64 | 40-bit | AMD64, on-die memory controller, HyperTransport, CMP, AMD-V (Athlon 64 Orleans), Socket 754/939/940 or AM2 |
          |  | 2004 | Pentium 4 (Prescott), Celeron D, Pentium D (2005) | 36-bit | EM64T, SSE3, 2nd gen. NetBurst pipelining, dual-core (Pentium D), Intel VT (Pentium 4 6x2), socket LGA 775 |
          |  | 2006 | Intel Core 2 |  | Intel 64 (renamed from EM64T), SSSE3 (65 nm), SSE4.1 (45 nm), wide dynamic execution, µ-op fusion, macro-µ-op fusion, Smart Shared L2 Cache |
          |  | 2007 | AMD Phenom/II (2008), Athlon II (2009), Turion II (2009) | 48-bit | Monolithic quad-core, SSE4a, Rapid Virtualization Indexing (RVI), HyperTransport 3, AM2+ or AM3 |
          |  | 2008 | Intel Atom | 36-bit | netbook or low power smart device processor, P54C core reused |
          |  |  | Intel Core i7, Core i5 (2009), Core i3 (2010) |  | QuickPath, on-chip GMCH (Clarkdale), SSE4.2, Extended Page Tables (EPT), LGA 1366 (Nehalem) or LGA 1156 socket |
          |  |  | VIA Nano |  | hardware-based encryption; adaptive power management |
          |  | 2010 | AMD FX | 48-bit | octa-core, CMT (Clustered Multi-Thread), FMA, OpenCL, AM3+ |
          |  | 2011 | AMD APU A and E Series (Llano) | 40-bit | on-die GPGPU, PCI Express 2.0, Socket FM1 |
          |  |  | AMD APU C, E and Z Series (Bobcat) | 36-bit | low power smart device APU |
          |  |  | Intel Core i3, Core i5 and Core i7 (Sandy Bridge/Ivy Bridge) |  | Internal ring connection, LGA 1155 socket |
          |  | 2012 | AMD APU A Series (Bulldozer, Trinity and later) | 48-bit | AVX, Bulldozer based APU, Socket FM2 or Socket FM2+ |
          |  |  | Intel Xeon Phi (Knights Corner) | 48-bit | PCIe-card coprocessor with its own coprocessor OS for Xeon-based systems, many-core chip, in-order P54C, very wide VPU (512-bit SSE), LRBni instructions (8× 64-bit) |
          |  | 2013 | AMD Jaguar (Athlon, Sempron) | 48-bit | SoC, game console and low power smart device processor |
          |  |  | Intel Silvermont (Atom, Celeron, Pentium) | 36-bit | SoC, low/ultra-low power smart device processor |
          |  |  | Intel Core i3, Core i5 and Core i7 (Haswell/Broadwell) | 39-bit | AVX2, FMA3, TSX, BMI1, and BMI2 instructions, LGA 1150 socket |
          |  | 2015 | Intel Broadwell-U (Intel Core i3, Core i5, Core i7, Core M, Pentium, Celeron) |  | SoC, on-chip Broadwell-U PCH-LP (multi-chip module) |
          |  | 2015/2016 | Intel Skylake/Kaby Lake/Cannonlake (Intel Core i3, Core i5, Core i7) |  |  |
          |  | 2016 | Intel Xeon Phi (Knights Landing) | 48-bit | Bootable and standalone accelerator supplement to Xeon system, Airmont (Atom) core based |
          |  | 2016 | AMD Bristol Ridge (AMD (Pro) A6/A8/A10/A12) | 48-bit | Integrated FCH on die, SoC, AM4 socket |
          |  | 2017 | AMD Ryzen Series |  | SMT |
          | Era | Release | CPU models | Physical Address Space | New features |
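
To make the address-space column easier to read, here is a small sketch that converts an address width in bits into the amount of memory it can reach, and shows the 8086 real-mode address calculation behind the 20-bit entry. The bit widths are the ones listed in the table; the function names (`addressable_bytes`, `human`, `real_mode_physical`) are made up for illustration only.

```python
# Sketch: how much memory an n-bit address can reach.
# Function names are illustrative, not taken from any library.

UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def addressable_bytes(bits):
    """Number of distinct byte addresses for an address that is `bits` wide."""
    return 1 << bits


def human(size):
    """Render a power-of-two byte count with a binary unit."""
    unit = 0
    while size >= 1024 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size} {UNITS[unit]}"


def real_mode_physical(segment, offset):
    """8086 real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # Widths that appear in the table's address-space column.
    for bits in (20, 24, 32, 36, 40, 48, 52):
        print(f"{bits}-bit -> {human(addressable_bytes(bits))}")
    print(hex(real_mode_physical(0xB800, 0x0000)))
```

Read this way, the step from 36-bit (PAE) to 40-bit or 48-bit physical addressing in the later rows is a step from 64 GiB to 1 TiB or 256 TiB.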
          • Re: A table needs evaluating

            I spent years correcting errors I found on Wikipedia, and I was kicked out and humiliated by the so-called experts there, in this case and in many others. Those Wikipedians may be profiting from some companies or organisations; my corrections made them fear losing that, so they were content to waste time and energy driving me out. Their edits on Wikipedia are very unfair to the AMD64 architecture, which I have loved since the very beginning. I devote most of my energy to it and take every chance to make corrections, and I find that worthwhile for my love of AMD64.

            • Re: A table needs evaluating

              Athlon is the seventh generation of AMD x86 processors. When I was in high school, many classmates built their computers around the Athlon for its comparatively low price, high FSB frequency and overclocking capability (a pencil was enough to unlock the fixed multiplier). We also cannot deny that we loved VIA products: the best chipsets paired with the best processors made that story glorious. I stuck with my Pentium III machine, though, and its performance did not seem any worse than the Athlon's. Even when the Pentium III Tualatin was introduced, I found no reason to switch to an AMD core; I simply upgraded from the Pentium III to the Tualatin with a PowerLeap adapter, a very famous product at the time. At the same frequency it could easily beat the Pentium 4, to say nothing of the Athlon. Comparing the processors of that period, one could hardly find a difference large enough to count them as another generation. So I eventually labelled the so-called seventh generation of x86 processors the Enhanced Platform. What people talked about most then was 64-bit computing and the phase-out of traditional x86. So whether Athlon or Pentium 4, what the manufacturers actually did was enhance the existing platform in preparation for the 64-bit computing that was then still to come.


              From the point of view of the ALU, the Athlon is a 32-bit processor, but the Pentium 4 is really a 16-bit one: a double-pumped 16-bit ALU gives the equivalent effect of a 32-bit ALU. A pipeline of more than 20 stages needs a larger fan that makes even more noise, and that is not even the craziest thing about the Pentium 4. I say Pentium 4, not Pentium IV. It may be the fourth generation of the Pentium series, but the digit "4" was emphasised all the time back then, so people simply focused on the quad-pumped FSB and RAMBUS. In my view this "4" is actually short for "64"; in other words, the Pentium 4 has been a 64-bit architecture processor since its introduction, only without the 64-bit ISA enabled. Of course, this 64-bit ISA or 64-bit computing capability is quite different from the AMD64 architecture. Intel might simply have expanded the registers from 32 to 64 bits and the physical addressing capability up to 46 bits. Such an expansion is not a newly introduced mode but an enhancement of the IA-32 protected mode, with which a 32-bit OS could run 64-bit applications, though in a way very different from what Apple did with Mac OS X. Unfortunately, compared with the IA-64 architecture it counted for nothing at all, which left plenty of room for AMD to develop its own 64-bit extension of the x86 architecture. That eventually evolved into another architecture, AMD64 or x86-64, rather than a 64-bit version of x86. AMD64 is an associated architecture: a standalone 64-bit x86-like architecture, associated with a modified version of the IA-32 protected mode (compatibility mode) for compatibility with x86 applications, and AMD64 processors incorporate the entire x86 runtime that is in effect when the processor initialises. This product gives end users the false impression that it is a 64-bit x86 processor. It is not; it is an x86-64 processor that adapts itself to the x86 ecosystem.


              When Intel realised that the ideals it had invested in Itanium had failed, it exposed the 64-bit computing capability of the Pentium 4 in a form closely resembling AMD64, and everything got misinterpreted. Many textbooks try to treat the two as one, which leaves students with confusion that is hard to resolve. In my table I do not count the generations of x86 processors after the Enhanced Platform, because they are for the most part x86-64 processors rather than merely x86.

              • Re: Wikipedia Assembly Language Programming

                With the help of a Venice-core AMD64 processor, I revisited the days when VIA lit its fire on the AMD64 computer. As for the table above, I still have questions, not only about its organisation but also about the essence of AMD64, or x86-64, processors.


                An AMD64 processor is a 64-bit processor from AMD. It is backward compatible with x86 software: not only x86 applications but also x86 operating systems.


                An Intel 64 processor is a processor from Intel that supports the Intel 64 architecture. In Intel's documents there are no words like "backward" or "support".


                The two are more or less the same yet different, and, ironically, the difference is not easy to see. To shed some light on this, the native architecture should be mentioned first. The native architecture of an AMD64 processor is AMD64, whereas the native architecture of an Intel 64 processor is IA-32 plus its invisible 64-bit extension, on which the Intel 64 architecture is implemented. That leaves another question: how is the x86 ISA implemented on an AMD64 processor? Purely in hardware, or emulated?


                My guess is that on an AMD64 processor the x86 ISA is implemented through a special kind of emulation, an architecture-level emulation. In other words, the AMD64 architecture is scaled down and provides enough acceleration for the emulation, and the underlying microarchitecture realises only the AMD64 architecture. Evidence can be inferred from the fact that VMware Workstation could host a 64-bit guest OS on a 32-bit host OS without hardware-based virtualisation; in other words, there is no mode transition. The secret is that a 64-bit processor never turns into a 32-bit one, but it can present itself as one. Put differently, legacy mode is hosted on the AMD64 architecture, and hidden routes can bring the processor back to its native architecture. Compare this with the Transmeta processors, which host the x86 architecture on a VLIW architecture with the help of a software interpreter, the Code Morphing Software (CMS); the AMD64 processor hosts the x86 architecture on the AMD64 architecture in a very smart way. In other words, the actual 64-bit architecture introduced by AMD is only partly visible in 64-bit mode; it hides the parts of itself that are used to behave like an x86 processor. That might be the very reason an Intel 64 processor could not host a 64-bit OS without hardware-based virtualisation.


                There are obvious distinctions between the AMD64 and Intel 64 architectures, but both share a common documented architecture designed by AMD: AMD64. In the table I reorganised, I refer to those processors as x86-64 processors, reflecting this common, documented 64-bit architecture. I also plan to divide the x86-64 processors into generations based on the progress of AMD64 processors. The first generation of AMD64 processors spans all Socket 754 processors, from revision C0 through E6, and ends with Venice, the slimmed-down 90 nm core. Over this period the AMD64 architecture grew from immature to mature: for example, the 32-bit segment limit check was introduced with the Winchester (D0) core, and the SSE3 instruction extension with the Venice core...

                • Re: Wikipedia Assembly Language Programming

                  What is a 64-bit processor, and what is an x64 processor? Are they the same or not?


                  An AMD64 processor is a 64-bit processor, but rather than having a native platform of its own, it extends the existing 32-bit x86 platform with 64-bit general computing capabilities. Such processors hide the real data bus behind a higher-level link (HyperTransport), so their nature is not easy to see. Thanks to third-party chipset designers such as VIA, those processors present a very familiar face (interface) to the x86 system.


                  The very first Intel 64 processor was an EM64T-enabled NetBurst-based Xeon, and its desktop counterpart was the famous Pentium 4 Prescott. Prescott is an enhanced version of the NetBurst cores found in Willamette and Northwood. The Pentium 4 is not a traditional IA-32 processor, because its internal ALU has a 16-bit data path that is further pipelined and double-pumped, which produces the same effect as a traditional 32-bit ALU. The good thing is that the pipeline stages were made thinner, which shortens each cycle and raises the processor frequency at a fair trade-off. The bad thing is that the double-pumped ALU runs even faster and produces even more heat. A 16-bit ALU seems a little ridiculous for a 32-bit processor architecture, but remember that the primitive unit of x86 is 16 bits. Such a design can implement 32-bit computing and also accelerate the 16-bit code of early x86 software. Prescott is an enhanced core that implements EM64T, either disabled or enabled. With it disabled, such processors seem no different from traditional Pentium 4 processors; with it enabled, they work much like AMD64 processors. So Prescott's ALU raises a puzzle: is it 16-bit or 32-bit, double-pumped or quad-pumped? Intel gave no information on this, but more likely, when working in IA-32e mode (long mode), the internal ALU switches from 16-bit to 32-bit, double-pumped. The very first desktop Intel 64 processors came in an LGA package and required a new series of chipsets (Intel 9xx), but Intel later also released a few EM64T-enabled processors in the Socket 478 package, helping particular customers upgrade their existing platforms with EM64T at lower cost. In other words, chipsets such as the Intel 848 and 865 could also support such processors without any modification.
If the only changes to a system are a new processor and updated firmware that provide 64-bit general computing capabilities, the procedure is called being extended (from the side of the system) or extending (from the side of the processor). The resulting system is called a 64-bit extended system, and the processor that extends it a 64-bit extending processor. They can be denoted as an x64 system and an x64 processor respectively. A 64-bit extended system differs from a real 64-bit system in that nothing has changed except the ISA. In the case of the Pentium 4 Prescott, it is hard to say from the internal ALUs whether it is a 32-bit or a 64-bit processor, so it is often called an Intel 64 architecture processor.
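
The idea that a 16-bit data path used twice can give the result of a 32-bit add can be illustrated with plain arithmetic. The sketch below is only that: an arithmetic illustration of adding the low halves first and carrying into the high halves, not a model of the actual NetBurst circuit. The function name `add32_in_two_halves` is made up for this example.

```python
# Sketch: a 32-bit addition carried out as two 16-bit additions.
# This illustrates the arithmetic only; it does not model real hardware timing.

MASK16 = 0xFFFF


def add32_in_two_halves(a, b):
    """Add two 32-bit values with a 16-bit adder used twice.

    Returns (sum modulo 2**32, carry_out).
    """
    # First half-step: add the low 16 bits of each operand.
    low = (a & MASK16) + (b & MASK16)
    carry = low >> 16

    # Second half-step: add the high 16 bits plus the carry from the low half.
    high = ((a >> 16) & MASK16) + ((b >> 16) & MASK16) + carry

    result = ((high & MASK16) << 16) | (low & MASK16)
    return result, high >> 16


if __name__ == "__main__":
    total, carry_out = add32_in_two_halves(0x0000FFFF, 0x00000001)
    print(hex(total), carry_out)
```

The second half-step cannot start until the carry from the first is known, which is why such a design has to run its narrow adder at twice the rate to keep up with a full-width one.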


                  Compared with what Intel did, AMD64 processors seem to have gone a little further. They also extend parts of the x86 system rather than merely replacing the processor core, and they extend the physical addressing capability from 36 bits (PAE) to 40 or 48 bits. Even though those AMD64 processors need new kinds of chipsets to build new systems, those systems are rooted in the traditional x86 platform, so they too are x64 processors and x64 systems...


                  In the table I reorganised, I refer to those x64 processors as x86-64, implying that they are in the process of evolving from x64 towards real 64-bit processors.

                  • Re: Wikipedia Assembly Language Programming

                    If x64 is the 64-bit extended system or 64-bit extending processor, then what is x32?


                    If Windows XP could have been extended to support 64-bit applications with a third kernel alongside the PAE-enabled one, there would have been no need for an x64 edition. The problem is that x86-64 is backward compatible with most x86 applications rather than being the 64-bit version of x86, so such a thing is almost impossible to implement. An AMD64 processor is a 64-bit extending processor, and a system based on it is a 64-bit extended system, but the AMD64 architecture is not merely a 64-bit extension of x86; it is also another architecture similar to x86, a 64-bit x86-like architecture. The obvious differences between the AMD64 and IA-32 architectures make it even harder to design such an additional kernel for the existing Windows XP. But there is something ridiculous, yet reasonable.


                    An AMD64 processor could virtualise a 64-bit OS without AMD-V enabled, while an Intel 64 processor could not virtualise a 64-bit OS without Intel VT enabled. Yet Apple really did implement such a system in Mac OS X, supporting 64-bit applications on a 32-bit kernel. Of course, a truly 32-bit kernel has no chance of executing 64-bit applications; the only possibility is to extend the 32-bit kernel so that it supports them. The good thing is that 32-bit drivers can be used to support devices under the 32-bit kernel while 64-bit applications run; the bad thing is that the 64-bit applications can only use the resources the 32-bit kernel provides. IA-32 and AMD64 use two different interrupt mechanisms, so how could such a thing be possible? Apple must have used undocumented instructions of Intel 64 processors. Another important point that cannot be denied is that Mac OS X is a multi-layered OS. Its developers put a great deal of thought into this when designing its ancestor, NeXTSTEP, so kernel extensions and device drivers are more easily portable than on other OSs. They only needed to hide the underlying parts and provide a light layer for platform emulation. When the system initialises, the 32-bit EFI firmware loads the proper EFI boot loader, which then checks whether EM64T is enabled: if not, it initialises the system with the pure 32-bit kernel; if so, with the 64-bit extended 32-bit kernel. This extended 32-bit kernel switches the processor from 32-bit protected mode to the compatibility mode of IA-32e mode, the environment in which most system and kernel routines execute. The underlying system routines are programmed in x86-64, but with the help of the light emulation layer mentioned above, kernel extensions and device drivers work as if they were in real 32-bit protected mode.
The x86-64 code also provides adequate resources to set up the environment for 64-bit applications. In such a system, most of the code is 32-bit x86 and only a few necessary parts are written in x86-64; in other words, 64-bit x86-64 code extends the legacy 32-bit x86 code. That is why I call this an x32 system, and Mac OS X is a complete design of this kind.
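
The boot-time decision described above (check whether the processor offers the 64-bit extension, then pick a kernel) can be sketched as follows. This is not Apple's boot loader: it is a hypothetical analogue that assumes a Linux system, where `/proc/cpuinfo` lists the `lm` (long mode) flag on x86-64 capable processors. The function names and the two kernel labels are invented for the example.

```python
# Sketch: choose a kernel depending on whether the CPU reports long mode.
# Assumes Linux-style /proc/cpuinfo text; kernel labels are hypothetical.


def supports_long_mode(cpuinfo_text):
    """True if a 'flags' line in /proc/cpuinfo-style text lists 'lm'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "lm" in value.split():
            return True
    return False


def choose_kernel(cpuinfo_text):
    """Mirror the two-way choice described in the post."""
    if supports_long_mode(cpuinfo_text):
        return "64-bit extended 32-bit kernel"
    return "pure 32-bit kernel"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(choose_kernel(f.read()))
    except OSError:
        print("no /proc/cpuinfo on this system")
```

A real boot loader would query the processor directly rather than read a text file; the file is used here only because it makes the check easy to try out.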


                    Designing an x32 system takes somewhat less effort than developing an x64 system from the ground up, but the AMD64 or x86-64 architecture is ill-suited to such a system; the nature of the architecture brings more trouble and problems. To avoid this, Apple dropped such support with the release of OS X 10.8. In order to support both the x86 and x86-64 architectures in a single distribution, the Knoppix developers designed an ABI-level x32 system. It provides two kernels within a single OS image: a pure x86 kernel and an x86-64 kernel that is aware of the x32 libs. The x32 libs play the role of an interpreter: when 32-bit applications call system routines, the libs interpret between them and the corresponding real 64-bit libs, which in turn communicate with the 64-bit kernel routines. Even though a 64-bit kernel is provided, there is no support for 64-bit applications, so it is a partial x32 system.


                    The letter "x" in both x64 and x32 has another meaning in common besides eXtending or eXtended: cross, crossing or crossed between x86 and x86-64. Such things often happen in a transitional era. In the table I reorganised, I call this the 64-bit extended era. As UEFI firmware takes over from the legacy BIOS and more and more applications are written in x86-64 code, x86 will matter less and less; ideally, pure x86-64 would take over from this 64-bit extended era. But I have to say that AMD64 is not an efficient architecture: programs written for it consume far more resources than x86 ones to implement similar functions, which is why x86 is still emphasised all the time. Even the upcoming Windows 10 ARM edition will provide emulated support for x86 rather than x86-64...