China is looking to overtake ChatGPT

China is looking to overtake ChatGPT. A Chinese company that recently unveiled an AI model spanning four major industry sectors has now announced that it intends to build a better AI model than ChatGPT by October.


iFlytek, a Chinese voice recognition company, is the latest entrant in the race to develop artificial intelligence models. The company recently unveiled its language model, SparkDesk, at a live event, and claims it will outperform OpenAI’s chatbot ChatGPT by October of this year.

When OpenAI released ChatGPT last year, it set the benchmark that everyone building generative AI is now competing against and trying to outdo.

While companies like Google and China’s Baidu have launched their own chatbots, ChatGPT remains the pioneer and leader in the race thanks to its groundbreaking capabilities.

OpenAI has also released a more advanced version of the AI model that powers ChatGPT. Meanwhile, others such as Alibaba are making rapid progress, and now iFlytek has officially and confidently claimed that it will leave ChatGPT behind.


SparkDesk vs ChatGPT

Liu Xingfeng, president of iFlytek, says the company analyzed ChatGPT’s main technical paths and, drawing on its customers’ needs, concluded that a general AI needs seven core competencies: text generation, language comprehension, question answering, logical reasoning, mathematical calculation, code generation, and multimodal ability.

During its live event, iFlytek demonstrated SparkDesk’s seven capabilities, for example by having the model create stories using characters from two different time periods. It also showcased voice recognition in both Chinese and English.

The company also said that the generative AI can handle a wide range of tasks such as writing emails, drafting press releases, and creating schedules, and is even better at generating long chunks of text compared to ChatGPT.

The company claims that SparkDesk already beats ChatGPT at answering questions and performing math calculations, and that it will outperform the OpenAI product overall by October of this year.

iFlytek also showed SparkDesk applications in areas such as education, automotive, office work, and the digital workforce. These include using AI to correct student essays, acting as a virtual tutor and learning aid for students, producing meeting minutes from handwritten notes, and enabling smarter driving experiences.

According to PanDaily, this AI model can also help employees perform repetitive tasks in the workplace.

Interestingly, iFlytek is one of the companies blacklisted by the US Department of Commerce, which bars it from purchasing critical components from US suppliers.

Bloomberg reported that the announcement of the chatbot’s launch comes as China’s internet regulator has released draft operating guidelines for generative AI services.

What is CPU; Everything you need to know about processors
How does the central processing unit (CPU), which manages the execution of all instructions and is often called the brain of the computer, actually work, and what are its components?


The central processing unit (CPU) is a vital element of any computer: it manages all the calculations and instructions that are passed to the other components and peripherals. Almost every electronic device and gadget you use, from desktops, laptops, and phones to gaming consoles and smartwatches, contains a CPU; indeed, this unit is so fundamental that without it a system will not even turn on, let alone be usable. The CPU executes incoming instructions at very high speed, and the other components of the computer only do useful work under its direction.

Table of contents
  • What is a processor?
  • Processor performance
  • Operating units of processors
  • Processor architecture
  • Set of instructions
  • RISC vs. CISC or ARM vs. x86
  • A brief history of processor architecture
  • ARM and X86-64 architecture differences
  • Processor performance indicators
  • Processor frequency
  • Cache memory
  • Processing cores
  • Difference between single-core and multi-core processing
  • Processing threads
  • What is hyper-threading or SMT?
  • CPU in gaming
  • What is a bottleneck?
  • Setting up a balanced system

Since the central processing unit manages data from every part of the computer simultaneously, it can slow down, or even stall or crash, as the volume of calculations and processes grows. Today, the most common CPUs on the market are built from semiconductor components on integrated circuits and are sold in many varieties; the leading manufacturers in this industry, AMD and Intel, have been competing in this field for more than 50 years.

What is a processor?

To get to know the central processing unit (CPU), we first briefly introduce a part of the computer called the SoC. An SoC, or system on a chip, integrates all the components a computer needs for processing onto a single silicon chip. An SoC contains various modules, of which the central processing unit (CPU) is the main one; the GPU, memory, USB controllers, power management circuits, and wireless radios (WiFi, 3G, 4G LTE, and so on) are additional components that may or may not be present on a given SoC. The CPU, which from here on we will simply call the processor, cannot process instructions independently of other chips; a complete computer only comes together with the rest of the SoC.

The SoC is slightly larger than the CPU yet offers much more functionality. In fact, despite the great emphasis placed on processor technology and performance, the processor by itself is not a computer; it is best thought of as a very fast calculator within the SoC: it retrieves data from memory and then performs some arithmetic (addition, multiplication) or logical (AND, OR, NOT) operation on it.


Processor performance


The process of processing instructions in the processor includes four main steps that are executed in order:

Fetch (retrieving instructions from memory): The processor first fetches instructions from memory so that it knows how to handle the input. The input may consist of one or many commands, each stored at a separate address. A unit called the PC (short for Program Counter) keeps track of the order of instructions, and the processor constantly communicates with RAM to find each instruction’s address (reading from memory).

Decode (translating instructions): Instructions are translated into a form the processor can understand (machine language, i.e., binary). Writing programs directly in binary has always been a difficult task, so code is written in simpler programming languages and then translated by a compiler or assembler into executable machine code ready for the processor.

Execute (processing the translated instructions): The most important step in the processor’s operation is executing the instructions. At this stage, the decoded binary instructions are processed with the help of the ALU (short for Arithmetic & Logic Unit).

Store (saving the execution results): The results and output of the instructions are stored in the processor’s registers, so that later instructions can refer to them quickly (writing to memory).

Cycle of instructions

The process described above is called the fetch-execute cycle, and it happens millions (in modern processors, billions) of times per second; once these four steps complete for one instruction, the next instruction goes through the same steps from the beginning, and so on until all instructions have been processed.
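As a rough sketch, the four-step cycle described above can be simulated in a few lines of Python. The opcodes, the accumulator register, and the program below are invented for illustration and do not correspond to any real instruction set:

```python
# Hypothetical program in "memory": (opcode, operand) pairs.
MEMORY = [
    ("LOAD", 7),   # put the literal 7 into the accumulator
    ("ADD", 5),    # accumulator += 5
    ("MUL", 2),    # accumulator *= 2
    ("HALT", 0),   # stop the cycle
]

def run(memory):
    pc = 0    # program counter: keeps the order of instructions
    acc = 0   # accumulator register holding intermediate results
    while True:
        opcode, operand = memory[pc]   # 1. fetch: read from memory at PC
        pc += 1
        if opcode == "HALT":           # 2. decode: select the operation
            break
        elif opcode == "LOAD":         # 3. execute: the "ALU" does the work
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "MUL":
            acc *= operand
        # 4. store: the result stays in the register for the next instruction
    return acc

print(run(MEMORY))  # (7 + 5) * 2 = 24
```

Each pass through the loop is one instruction cycle; a real processor repeats this billions of times per second in hardware rather than in a Python loop.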

Operating units of processors

Each processor consists of three operational units that play a role in the process of processing instructions:

Arithmetic & Logic Unit (ALU): a complex digital circuit that performs arithmetic and comparison operations; in some processors, the ALU is divided into an AU (for arithmetic operations) and an LU (for logical operations).

Control Unit (CU): a circuit that directs and manages operations within the processor, telling the arithmetic and logic unit and the input and output devices how to respond to instructions. The control unit’s operation can differ depending on the processor’s design architecture.

Register unit (Register): The register unit is a unit in the processor that is responsible for temporarily storing processed data, instructions, addresses, sequence of bits, and output, and must have sufficient capacity to store these data. Processors with 64-bit architecture have registers with 64-bit capacity, and processors with 32-bit architecture have 32-bit registers.

Processor architecture

The relationship between the instructions and the processor’s hardware design forms the processor architecture. But what is a 64-bit or 32-bit architecture, and what are the differences between the two? To answer this, we must first familiarize ourselves with instruction sets and how their calculations are performed:

Set of instructions

An instruction set is the set of operations a processor can execute natively. It consists of many simple, elementary instructions (such as addition, multiplication, and data transfer) whose execution is defined in advance; if an operation falls outside this set, the processor cannot execute it directly.

As mentioned, the processor is responsible for executing programs: sets of instructions written in a programming language that must be carried out in logical order, exactly step by step.


Since computers do not understand programming languages directly, these instructions must be translated into machine language, the binary form computers work with. Binary consists of only the digits zero and one, representing the two possible states of a transistor: on (one) or off (zero).

In fact, a processor can be thought of as a collection of electrical circuits: when an instruction from the instruction set arrives, an electrical signal activates the circuits corresponding to that operation, and the processor executes it.

Instructions consist of a certain number of bits. In an 8-bit instruction, for example, the first 4 bits encode the operation (the opcode) and the next 4 bits encode the data to be used (the operand). Instruction length can vary from a few bits to several hundred, and in some architectures instructions have variable lengths.
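The 4-bit opcode / 4-bit operand split in the example above can be sketched with simple bit operations; the instruction value here is made up for illustration:

```python
# Split a hypothetical 8-bit instruction into a 4-bit opcode
# (upper bits) and a 4-bit operand (lower bits).
def decode(instruction):
    opcode  = (instruction >> 4) & 0b1111   # shift the upper 4 bits down
    operand = instruction & 0b1111          # mask off the lower 4 bits
    return opcode, operand

# 0b0001_0110 -> opcode 1, operand 6
print(decode(0b00010110))
```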

In general, the set of instructions is divided into the following two main categories:

  • Reduced instruction set computer (RISC, pronounced “risk”): the set of defined operations is simple and basic. Instructions execute quickly and efficiently and are optimized to reduce execution time; RISC does not need complex circuits, so its design cost is low. RISC processors complete each instruction in a single cycle and operate only on data held in registers. As a result, the instructions are simple, clock frequencies can be higher, the data-routing structure is more streamlined, and memory is accessed only through load and store operations on registers.
  • Complex instruction set computer (CISC): CISC processors have an additional layer of microcode, or microprogramming, that converts complex instructions into simple ones (such as addition or multiplication). These micro-instructions are stored in fast memory and can be updated. A CISC instruction set can contain many more instructions than a RISC one, and instructions can have variable lengths. In many ways CISC is the opposite of RISC: a CISC instruction can span multiple processor cycles, and data routing is not as efficient as in RISC processors. In general, a CISC processor can perform several operations within a single complex instruction, but it takes multiple cycles to do so.
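The contrast between the two styles can be sketched as a toy example; the “instructions” and cycle counts below are invented for illustration and are not taken from any real instruction set:

```python
# Toy contrast: adding two numbers held in memory.
mem = {"a": 2, "b": 3, "c": 0}

# CISC style: one complex instruction works on memory directly
# and internally takes several cycles.
def cisc_add_mem(dst, src1, src2):
    mem[dst] = mem[src1] + mem[src2]   # one instruction, multiple cycles

# RISC style: only loads/stores touch memory; the ALU works on
# registers, one simple single-cycle instruction at a time.
regs = {}
def load(reg, addr):   regs[reg] = mem[addr]              # 1 cycle
def add(dst, r1, r2):  regs[dst] = regs[r1] + regs[r2]    # 1 cycle
def store(addr, reg):  mem[addr] = regs[reg]              # 1 cycle

cisc_add_mem("c", "a", "b")        # 1 complex instruction
load("r1", "a"); load("r2", "b")   # 4 simple instructions
add("r3", "r1", "r2")
store("c", "r3")
print(mem["c"])  # 5 either way; the difference is how the work is split
```

Both styles compute the same result; RISC issues more instructions, each trivial, while CISC issues fewer, each internally multi-step.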

RISC vs. CISC or ARM vs. x86

RISC and CISC are the two ends of the instruction set design spectrum, and various combinations of the two exist in between. First, the basic differences between RISC and CISC:

RISC (reduced instruction set) vs. CISC (complex instruction set):

  • RISC instructions are simple: each performs a single operation, and the processor can process it in one cycle. CISC instructions perform multiple operations, and the processor cannot process them in a single cycle.
  • RISC processors have simpler, more optimized data routing; the instructions are simple enough to be implemented in stages. CISC processors are complex by nature, and their instructions are harder to execute.
  • RISC processors operate only on data already loaded into registers. CISC processors can work on operands directly in RAM, with no separate load step.
  • RISC needs little complex hardware, leaving more of the work to software. CISC has higher hardware requirements: instructions are implemented in hardware, so software is often simpler than for RISC, programs need less code, and the instructions themselves do much of the work.

As mentioned, modern processor designs combine the two approaches. For example, the x86 architecture used by Intel and AMD is originally a CISC instruction set, but modern x86 processors use microcode to break complex instructions down into simpler, RISC-like ones. Now that we have covered the two main categories of instruction sets, let’s examine how they are used in processor architecture.

If you pay attention to the processor architecture when choosing a phone or tablet, you will notice that some models use Intel processors, while others are based on ARM architecture.

Suppose every processor had its own unique instruction set; then every program would have to be compiled separately for every processor. A separate build of Windows would be needed for each AMD processor, and thousands of versions of Photoshop for different chips. For this reason, standard architectures based on RISC, CISC, or a combination of the two were designed, and their specifications were made available to everyone. ARM, PowerPC, x86-64, and IA-64 are examples of such standards; below we introduce two of the most important ones and their differences:

A brief history of processor architecture

In 1823, Jöns Jacob Berzelius first isolated the chemical element silicon (symbol Si, atomic number 14). Thanks to its abundance and strong semiconductor properties, silicon became the main material for processors and computer chips. Over a century later, in 1947, John Bardeen, Walter Brattain, and William Shockley invented the first transistor at Bell Labs, for which they later received the Nobel Prize.

Silicon atom

The first working integrated circuit (IC) was unveiled in September 1958, and two years later IBM developed the first automated mass-production facility for transistors in New York. Intel was founded in 1968, and AMD a year later.

Intel invented the first microprocessor in the early 1970s: the Intel 4004, which, with its 2,300 transistors, performed 60,000 operations per second. The Intel 4004 was priced at $200 and could address only 640 bytes of memory:

Intel 4004 (C4004)

After Intel, Motorola introduced its first 8-bit processor, the MC6800, running at one to two MHz. MOS Technology then introduced a faster and cheaper chip, the 6502, which powered gaming consoles and home computers of the era such as the Atari 2600, Nintendo’s systems, the Apple II, and the Commodore 64. Motorola developed its first 32-bit processor in 1979, though it was used mainly in Apple’s Macintosh and Amiga computers; a little later, National Semiconductor released the first 32-bit processor for general use.

In 1993, the first PowerPC processor, based on a 32-bit instruction set, was released. It was developed by the AIM consortium (Apple, IBM, and Motorola), and Apple migrated its Macintosh line from Motorola’s 68k processors to PowerPC at that time.

The difference between 32-bit and 64-bit processors (x86 vs. x64): Simply put, x86 refers to the family of instruction sets descended from one of Intel’s most successful processors, the 8086. By convention, 32-bit (and earlier 16-bit) versions of Windows and compatible processors are labeled x86, while 64-bit ones are labeled x64 (also written x86-64).

The biggest difference between 32-bit and 64-bit processors is their different access to RAM:

The maximum physical memory of the x86 architecture, i.e., 32-bit processors, is limited to 4 GB, while the x64 architecture (64-bit processors) can address vastly more (terabytes in practice, depending on the operating system and hardware). A 64-bit computer can run both 32-bit and 64-bit programs; in contrast, a 32-bit computer can only run 32-bit programs.
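The 4 GB ceiling follows directly from the size of the address: with 32 bits there are only 2^32 distinct byte addresses. A quick back-of-the-envelope check:

```python
# With n address bits there are 2**n distinct byte addresses.
GIB = 2 ** 30                 # bytes in one gibibyte

print((2 ** 32) / GIB)        # 4.0 -> the 32-bit limit of 4 GB
print((2 ** 64) / GIB)        # 17179869184.0 GiB, i.e. 16 EiB in theory
```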

In most cases, 64-bit processors are more efficient than 32-bit processors when processing large amounts of data. To find out whether your Windows operating system is 32-bit or 64-bit, follow either of these two paths:

  • Press Win + X to bring up the context menu, then click System. In the window that opens, find System type under Device specifications; it shows whether your Windows is 64-bit or 32-bit.
  • Type msinfo32 in the Windows search box and click the System Information result. In the pane on the right, find System type and see whether your operating system is x64-based or x86-based.
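As an alternative to the two GUI paths above, bitness can also be checked programmatically. This sketch uses only Python’s standard library; note that `platform.machine()` reports the hardware architecture, while the pointer-size check reports the bitness of the running Python interpreter, which may be 32-bit even on a 64-bit OS:

```python
import platform
import struct

# Hardware/OS architecture string, e.g. 'AMD64' or 'x86_64' on 64-bit systems.
print(platform.machine())

# Pointer size of the running interpreter: 32 or 64.
print(struct.calcsize("P") * 8)
```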


ARM is a processor architecture introduced by Acorn in the 1980s. Before ARM, Intel and AMD both used Intel’s x86 architecture, based on CISC computing, while IBM used RISC computing in its workstations. Acorn was in fact the first company to develop a home computer based on RISC computing, and the architecture was named after the company itself: Acorn RISC Machine. Acorn did not manufacture processors; instead, it licensed the ARM architecture to other processor manufacturers, and some years later the name was changed to Advanced RISC Machines.

The original ARM architecture processes 32-bit instructions, and a processor core based on it requires as few as about 35,000 transistors, whereas processors based on Intel’s CISC-style x86 architecture require millions. This low transistor count is what makes ARM-based processors so power-efficient and well suited to devices such as phones and tablets.

In 2011, ARM introduced the ARMv8 architecture with support for 64-bit instructions and a year after that, Microsoft also launched a Windows version compatible with the ARM architecture along with the Surface RT tablet.

ARM and X86-64 architecture differences

The ARM architecture is designed to be as simple as possible while keeping power dissipation to a minimum. On the other hand, Intel uses more complex settings with the X86 architecture, which is more suitable for more powerful desktop and laptop processors.

Computers moved to 64-bit after AMD introduced the modern x86-64 architecture (also known as x64), which Intel later adopted. A 64-bit architecture is essential for demanding workloads, performing 3D rendering and encryption with greater precision and speed. Today both architectures support 64-bit instructions, although this technology reached mobile later.

When ARM implemented 64-bit architecture in ARMv8, it took two approaches to this architecture: AArch32 and AArch64. The first one is used to run 32-bit codes and the other one is used to run 64-bit codes.

The ARM architecture is designed to switch between the two modes very quickly, so the 64-bit instruction decoder did not have to sacrifice backward compatibility with 32-bit instructions. However, ARM has announced that from 2023, processors based on the ARMv9 Cortex-A architecture will be compatible with 64-bit instructions only, and support for 32-bit applications and operating systems will end in next-generation cores.

The differences between the ARM and Intel architectures largely reflect the goals and challenges of the two companies. ARM’s focus on power efficiency suits the sub-5-watt budgets of mobile phones, yet ARM-based processors have improved to the level of Intel’s laptop parts. Intel’s Core i7 and Core i9 desktop processors (and comparable AMD parts), by contrast, can draw around 100 watts, a great asset in high-end desktops and servers, but that design has historically not scaled down below a few watts.

Nanometer process

Processors built with more advanced transistors consume less power. Intel long struggled to move its lithography beyond 14 nm and only recently succeeded in shipping processors on its 10 nm process; meanwhile, mobile processors moved from 20 nm to 14, 10, and 7 nm designs, driven by competition between Samsung and TSMC. AMD, for its part, unveiled 7 nm processors in the Ryzen series and pulled ahead of its x86-64 rival.

Nanometer: a meter divided by a thousand is a millimeter, a millimeter divided by a thousand is a micrometer, and a micrometer divided by a thousand is a nanometer; in other words, a nanometer is one billionth of a meter.

Lithography, or manufacturing process: lithography, from the Greek for “writing on stone,” refers here to the way components are laid out on processors, i.e., the process of producing and forming circuits, carried out by specialized manufacturers such as TSMC. From the first processors until a few years ago, the nanometer figure described the spacing of components on the chip; the 14 nm lithography of 2015’s Skylake processors, for example, meant the components were about 14 nm apart. The rule of thumb was that the smaller the process, the more efficient the power consumption and the better the performance.

Nowadays the nanometer figure no longer corresponds to the physical spacing of components, and process names have become largely marketing conventions, because distances cannot shrink much further without hurting yields. As technology has advanced, transistor designs have diversified, and transistor counts have grown, manufacturers have adopted other solutions, such as 3D stacking, to pack more transistors onto processors.

The most distinctive feature of the ARM architecture is its low power consumption when running mobile applications, an achievement that comes from ARM’s heterogeneous processing capability: the architecture allows work to be divided between powerful and low-power cores, so energy is used more efficiently.

big.LITTLE architecture

ARM’s first attempt in this field dates back to the big.LITTLE architecture of 2011, which paired large Cortex-A15 cores with small Cortex-A7 cores. The idea of using powerful cores for heavy workloads and low-power cores for light and background processing may seem obvious now, but it took ARM several attempts and failures to get it right. Today ARM is the dominant architecture in the mobile market: iPads and iPhones, for example, use it exclusively.
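The big.LITTLE division of labor can be sketched as a toy scheduler; the task names, load figures, and threshold below are all made up for illustration:

```python
# Hypothetical tasks as (name, load) pairs, where load is a made-up
# 0-100 measure of how demanding the task is.
TASKS = [("render_video", 90), ("check_email", 5),
         ("game", 80), ("sync_clock", 1)]

def schedule(tasks, heavy_threshold=50):
    """Send heavy tasks to fast 'big' cores, light ones to 'LITTLE' cores."""
    placement = {}
    for name, load in tasks:
        placement[name] = "big" if load >= heavy_threshold else "LITTLE"
    return placement

print(schedule(TASKS))
# {'render_video': 'big', 'check_email': 'LITTLE',
#  'game': 'big', 'sync_clock': 'LITTLE'}
```

Real schedulers (e.g., in the Linux kernel) weigh far more than a single load figure, but the principle is the same: background work should not wake a power-hungry core.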

In the meantime, Intel’s Atom processors, which lacked heterogeneous processing, could not match the performance and efficiency of ARM-based processors, and Intel fell behind ARM.

Finally, in 2020, Intel shipped a hybrid core design in its 10 nm Lakefield processors, combining one powerful core (Sunny Cove) with four low-power cores (Tremont), alongside integrated graphics and connectivity. But the product targeted laptops with a 7-watt power budget, which is still too high for phones.

Another important distinction between Intel and ARM lies in their business models. Intel uses its architecture in processors it manufactures and sells itself, while ARM licenses its designs and architecture, with customization rights, to other companies such as Apple, Samsung, and Qualcomm, which can then modify the instruction set and design to suit their own goals.

Designing custom processors is expensive and complicated, but done right, the end products can be very powerful. Apple, for example, has repeatedly proven that customizing the ARM architecture can bring its processors to parity with x86-64, or beyond.

Apple eventually plans to remove all Intel-based processors from its Mac products and replace them with ARM-based silicon. The M1 chip was Apple’s first step in this direction, released in the MacBook Air, MacBook Pro, and Mac mini; the M1 Max and M1 Ultra chips then showed that the ARM architecture, combined with Apple’s refinements, can genuinely challenge x86-64.

As mentioned earlier, standard architectures based on RISC, CISC, or a combination of the two were designed and their specifications published; applications and software must be compiled for the processor architecture they run on. This used to be a minor concern, given how few platform and architecture combinations existed, but today far more applications need separate builds to run on different platforms.

ARM-based Macs, Google’s Chrome OS, and Microsoft’s Windows are all examples of platforms where software must run on both ARM and x86-64 architectures. Compiling software natively for each architecture is the ideal solution, but it is not always available.

In fact, these platforms can emulate each other’s code, so a binary compiled for one architecture can be executed on another. It goes without saying that such emulation costs performance compared with a native build, but the mere possibility is very useful for now.

After years of development, the Windows emulator on ARM-based platforms now delivers acceptable performance for most applications; Android applications run more or less satisfactorily on Intel-based Chromebooks; and Apple has developed its own binary translation tool, Rosetta 2, to support older Mac applications that were built for the Intel architecture.

However, as mentioned, all three run programs more slowly than if the software had been written from scratch for each platform. Overall, the ARM and Intel x86-64 architectures can be compared as follows:

architecture

ARM

X86-64

CISC vs. RISC

The ARM architecture is an architecture for processors and therefore does not have a single manufacturer. This technology is used in the processors of Android phones and iPhones.

The X86 architecture is produced by Intel and is exclusively used in desktop and laptop processors of this company.

Complexity of instructions

The ARM architecture uses only one cycle to execute an instruction, and this feature makes processors based on this architecture more suitable for devices that require simpler processing.

The Intel architecture (or the X86 architecture associated with 32-bit Windows applications) often uses CISC computing and therefore has a slightly more complex instruction set and requires several cycles to execute.

Mobile CPUs vs. Desktop CPUs

The dependence of the ARM architecture on the software makes this architecture be used more in the design of phone processors; ARM (in general) works better on smaller technologies that don’t have constant access to the power supply.

Because Intel’s X86 architecture relies more on hardware, this architecture is typically used to design processors for larger devices such as desktops; Intel focuses more on performance and is considered a better architecture for a wider range of technologies.

Energy consumption

The ARM architecture not only consumes less energy thanks to its single-cycle computing set but also has a lower operating temperature than Intel’s X86 architecture; ARM architectures are great for designing phone processors because they reduce the amount of energy required to keep the system running and execute the user’s requested commands.

Intel’s architecture is focused on performance, so it won’t be a problem for desktop or laptop users who have access to an unlimited power source.

Processor speed

CPUs based on ARM architecture are usually slower than their Intel counterparts because they perform calculations with lower power for optimal consumption.

Processors based on Intel’s X86 architecture are used for faster computing.

operating system

ARM architecture is more efficient in the design of Android phone processors and is considered the dominant architecture in this market; Although devices based on the X86 architecture can also run a full range of Android applications, these applications must be translated before running. This scenario requires time and energy, so battery life and overall processor performance may suffer.

Intel reigns as the dominant architecture in tablets and Windows devices. Microsoft did release the Surface Pro X in 2019 with an ARM-based processor that could run the full version of Windows, but if you are a gamer or expect more from your tablet than simply running full Windows, the Intel architecture is still the better choice.

Over the past ten years of competition between Arm and x86, Arm can be considered the winning architecture for low-power devices such as phones, and it has also made great strides in laptops and other devices that require efficient energy consumption. Intel, for its part, lost the phone market, but its efforts to optimize power consumption have brought significant improvements over the years; with the development of hybrid designs such as Lakefield and Alder Lake, its processors now have more in common with Arm-based ones than ever. Arm and x86 remain distinctly different from an engineering point of view, each with its own strengths and weaknesses, yet it is no longer easy to separate their use cases, since both architectures enjoy growing support across ecosystems.

Processor performance indicators

Processor performance has a great impact on how quickly programs load and how smoothly they run, and there are various measures of it, of which frequency (clock speed) is one of the most important. Be careful, though: the frequency of each core is a measure of its processing power, but it does not necessarily represent the overall performance of the processor. Many other factors, such as the number of cores and threads, the internal architecture (synergy between the cores), cache capacity, overclocking headroom, thermal design power, power consumption, and IPC, must also be considered when judging overall performance.

Synergy is an effect that results from the flow or interaction of two or more elements. If this effect is greater than the sum of the effects that each of those individual elements could create, then synergy has occurred.

In the following, we will explain more about the factors influencing the performance of the processor:

Processor frequency

One of the most important factors in choosing and buying a processor is its frequency (clock speed), which is usually a fixed number for all of its cores. The number of operations the processor performs per second is known as its speed and is expressed in hertz: megahertz (MHz) for older processors, or gigahertz (GHz) today.

At the same frequency, a processor with a higher IPC can do more processing and is more powerful.

More precisely, frequency refers to the number of computing cycles the processor cores perform per second and is measured in gigahertz (GHz, one billion cycles per second).

For example, a 3.2 GHz processor performs 3.2 billion cycles per second. Processors passed the one-megahertz (MHz) mark, one million cycles per second, in the early 1970s, and around 2000 the gigahertz (GHz), equal to one billion hertz, became the common unit for measuring their frequency.

Sometimes, multiple instructions are completed in one cycle, and in some cases, an instruction may be processed in multiple cycles. Since different architectures and designs of each processor perform instructions in a different way, the processing power of their cores can be different depending on the architecture. In fact, without knowing the number of instructions processed per cycle (IPC) comparing the frequency of two processors is completely meaningless.

Suppose we have two processors, one produced by company A and the other by company B, both running at the same frequency of one GHz. With no other information, we might consider the two equal in performance; but if company A's processor completes one instruction per cycle while company B's completes two instructions per cycle, then company B's processor will obviously be the faster of the two.

In simpler words, at the same frequency, a processor with a higher IPC can do more processing and is more powerful. So, to properly evaluate the performance of each processor, in addition to the frequency, you will also need the number of instructions it performs in each cycle.
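As a rough sketch of this relationship, effective throughput can be modeled as frequency multiplied by IPC. The helper below is illustrative, not a real benchmark, and mirrors the two hypothetical 1 GHz processors from the example above:

```python
# Toy model: effective throughput = frequency (cycles/s) x IPC
# (instructions completed per cycle).
def instructions_per_second(frequency_ghz: float, ipc: float) -> float:
    """Return billions of instructions executed per second."""
    return frequency_ghz * ipc

cpu_a = instructions_per_second(1.0, 1.0)  # company A: 1 instruction/cycle
cpu_b = instructions_per_second(1.0, 2.0)  # company B: 2 instructions/cycle

print(cpu_a)  # 1.0 (billion instructions/s)
print(cpu_b)  # 2.0 (billion instructions/s)
```

Despite identical clock speeds, the higher-IPC design finishes twice the work per second, which is why frequency alone says little across different architectures.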

Therefore, it is better to compare a processor's frequency only against processors of the same series and generation. A newer processor with a lower frequency may well outperform a five-year-old processor with a higher frequency, because newer architectures handle instructions more efficiently.

Intel’s X-series processors may outperform higher-frequency K-series processors because they split tasks between more cores and have larger caches; On the other hand, in the same generation of processors, a processor with a higher frequency usually performs better than a processor with a lower frequency in many applications. This is why the manufacturer company and processor generation are very important when comparing processors.

Base frequency and boost frequency: the base frequency of a processor is the minimum frequency it runs at when idle or under light load, while the boost frequency shows how far the processor can raise its clock when performing heavier calculations or more demanding processes. Boost frequencies are applied automatically and are limited by the heat of heavy processing before the processor reaches unsafe levels.

In fact, a processor's frequency cannot be raised without running into physical limits (mainly electricity and heat), and past roughly 3 GHz, power consumption increases disproportionately.

Cache memory

Another factor that affects processor performance is the capacity of the processor's cache memory. This memory works much faster than the system's main RAM because it sits next to (or inside) the processor, which uses it to store data temporarily and reduce the time spent transferring data to and from system memory.

Related articles:
  • What is L2, L1, and L3 cache memory and what effect does it have on processor performance?

Cache can therefore have a large impact on processor performance: the more cache a processor has, the better its performance tends to be. Fortunately, all users can now access benchmark tools and evaluate processor performance themselves, regardless of manufacturers' claims.

Cache memory is organized in levels, indicated by the letter L. Processors usually have up to three or four levels of cache: the first level (L1) is faster than the second (L2), the second is faster than the third (L3), and the third is faster than the fourth (L4). Cache typically offers up to several tens of megabytes of storage, and the more of it there is, the higher the processor's price.

CPU cache

The cache memory is responsible for holding data; it is faster than the computer's RAM and therefore reduces delays in executing commands. In fact, the processor first checks the cache for the data it needs, and only if the data is not there does it go to RAM.

  • Level-one cache (L1), also called the first or internal cache, is the memory closest to the processor; it is faster and smaller than the other cache levels and stores the most important data needed for processing, because the processor turns to L1 first when it processes an instruction.
  • Level-two cache (L2), also called external cache, is slower and larger than L1 and, depending on the processor's structure, may be shared or per-core. Unlike L1, L2 sat on the motherboard in old computers; in new processors it is placed on the chip itself and has lower latency than the next level, L3.
  • Level-three cache (L3) is shared by all the cores in the processor and has a larger capacity than L1 or L2, but it is slower.
  • Like L3, the L4 cache is larger and slower than L1 or L2; L3 and L4 are usually shared.
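The lookup order described above can be sketched as a tiny simulation. The latency figures (in CPU cycles) and the cache contents are invented for illustration and vary widely between real processors:

```python
# Hypothetical latencies, in cycles, for each level of the hierarchy.
LEVELS = [("L1", 4), ("L2", 12), ("L3", 40), ("RAM", 200)]

def access_cost(address, contents):
    """Return (level_name, cycles) for the first level holding `address`.

    Mirrors the order described in the text: L1 first, then L2, L3,
    and finally main RAM on a full miss.
    """
    for name, cycles in LEVELS[:-1]:
        if address in contents.get(name, set()):
            return name, cycles
    return "RAM", LEVELS[-1][1]

contents = {"L1": {0x10}, "L2": {0x10, 0x20}, "L3": {0x10, 0x20, 0x30}}
print(access_cost(0x20, contents))  # found in L2
print(access_cost(0x99, contents))  # full miss, served from RAM
```

The point of the sketch is the ordering: a hit in a small, close level saves the much larger cost of going out to main memory.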

Processing cores

The core is the processing unit of the processor that can independently perform or process all computing tasks; from this point of view, a core can be seen as a small processor within the whole central processing unit. Each core consists of an arithmetic logic unit (ALU), a control unit (CU), and registers, which together process instructions through the fetch-decode-execute cycle.


Early processors had only one core, but today most processors are multi-core, with two or more cores on a single integrated circuit processing two or more tasks simultaneously. Note that each core can still execute only one instruction at a time. Multi-core processors execute sets of instructions or programs faster than before through parallel computing. Having more cores does not automatically increase overall performance, however, because many programs still do not use parallel processing.

  • Single-core processors: The oldest type of processor, with a single core that can execute only one instruction at a time, making it inefficient for multitasking. Starting a new process requires the previous operation to finish, and running more than one program degrades performance significantly. A single-core processor's performance is judged by measuring its power, based on frequency.
  • Dual-core processors: A dual-core processor contains two cores and performs like two single-core processors in one package. Unlike a single-core chip, it switches back and forth between multiple streams of data, so when more threads are running, it handles multiple processing tasks more efficiently.
  • Quad-core processors: A quad-core processor refines the multi-core design by dividing the workload among four cores, enabling more effective multitasking; hence it is better suited to gamers and professional users.
  • Six-core processors (Hexa-Core): Another type of multi-core processor, with six cores that process faster than the quad-core and dual-core types. For example, some of Intel's Core i7 processors have six cores and are well suited to everyday use.
  • Octa-Core processors: Octa-core processors pack eight independent cores and offer better performance than the types above. They are effectively a dual set of quad-core processors that divide activities between the two halves: in many cases only the minimum required cores do the work, and the other four join the calculations when an urgent need arises.
  • Ten-core processors (Deca-Core): Ten-core processors consist of ten independent cores and are more powerful than the other processors at executing and managing processes. They are the fastest of these types, handle multitasking in the best possible way, and more of them reach the market every day.

Difference between single-core and multi-core processing

In general, the choice between a powerful single-core processor and a multi-core processor of ordinary power depends entirely on how you use your system; there is no one-size-fits-all prescription. Strong single-core performance matters for software that does not need, or cannot use, multiple cores. Having more cores does not necessarily mean faster, but if a program is optimized to use multiple cores, it will run faster with more of them. If you mostly use applications optimized for single-core processing, you probably will not benefit from a processor with a large number of cores.

Let’s say you want to take 2 people from point A to B, of course a Lamborghini will do just fine, but if you want to transport 50 people, a bus can be a faster solution than multiple Lamborghini commutes. The same goes for single-core versus multi-core processing.
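The analogy can be made quantitative with Amdahl's law, which caps the speedup from extra cores by the fraction of the program that can run in parallel (the fractions below are hypothetical):

```python
# Amdahl's law: only the parallelizable part of a program benefits
# from extra cores; the serial part always runs at single-core speed.
def speedup(parallel_fraction: float, cores: int) -> float:
    """Overall speedup for a program whose given fraction is parallel."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

# A 95%-parallel workload scales well on 8 cores...
print(round(speedup(0.95, 8), 2))  # 5.93
# ...while a 50%-parallel one barely benefits from them.
print(round(speedup(0.50, 8), 2))  # 1.78
```

This is the bus-versus-Lamborghini trade-off in numbers: adding cores only pays off when the workload can actually be split across them.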

In recent years, processor cores have become increasingly smaller, so more cores fit on a single processor chip; the operating system and software must also be optimized to divide instructions across more cores and execute them simultaneously. When this is done correctly, the performance gains are impressive.

Related articles:
  • How do processors use multiple cores?
  • How do Windows and other operating systems use multiple cores in a processor?

In traditional multi-core processors, all cores were implemented the same and had the same performance and power rating. The problem with these processors was that when the processor is idle or doing light processing, it is not possible to lower the energy consumption beyond a certain limit. This issue is not a concern in conditions of unlimited access to power sources but can be problematic in conditions where the system relies on batteries or a limited power source for processing.

This is where the concept of asymmetric processor design was born. In smartphones, chip designers quickly adopted a solution in which some cores are more powerful and provide better performance, while others are implemented as low-power cores; the latter are only good for background tasks or basic applications such as reading and writing email or browsing the web.

High-powered cores automatically kick in when you launch a video game or when a heavy program needs more performance to do a specific task.

Cores and threads

Although the combination of high-power and low-consumption cores in processors is not a new idea, using this combination in computers was not so common, at least until the release of the 12th generation Alder Lake processors by Intel.

Every model in Intel's 12th generation combines E cores (low-power) and P cores (high-performance). The ratio between the two types varies: in the Alder Lake Core i9 series, for example, eight cores handle heavy processing and eight handle light processing, while the i7 and i5 series use 8+4 and 6+4 configurations of P and E cores, respectively.

There are many advantages to a hybrid-core architecture, and laptop users benefit the most, because most daily tasks, such as web browsing, do not require intensive performance. When only the low-power cores are involved, the computer or laptop stays cool and the battery lasts longer.

Low-power cores are simple and inexpensive to produce, so using them to boost and free up powerful, advanced cores seems like a smart idea.

Even if you have your system connected to a power source, the presence of low-power cores will be efficient. For example, if you are engaged in gaming and this process requires all the power of the processor, powerful cores can meet this need, and low-power cores are also responsible for running background processes or programs such as Skype, etc.
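A toy dispatcher conveys the idea. The core names follow the P/E terminology above, but the load threshold and the task list are invented purely for illustration; real schedulers (in the OS and Intel's Thread Director) are far more sophisticated:

```python
# Hypothetical policy: demanding tasks go to high-performance P cores,
# light background tasks to low-power E cores.
def assign_core(task_load: float, threshold: float = 0.5) -> str:
    """Pick a core type for a task whose load is scaled to 0..1."""
    return "P-core" if task_load >= threshold else "E-core"

# Illustrative workload mix from the scenario described above.
tasks = {"game": 0.9, "video call": 0.2, "mail sync": 0.1}
for name, load in tasks.items():
    print(f"{name} -> {assign_core(load)}")
```

The design goal is exactly what the text describes: keep the expensive, hot cores free (and idle) unless something actually needs them.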

At least in the case of Intel’s Alder Lake processors, the P and E cores are designed to not interfere with each other so that each can perform tasks independently. Unfortunately, since combining different processors is a relatively new concept for x86 processors, this fundamental change in the x86 architecture is fraught with problems.

Before the idea of hybrid cores (the combination of powerful P cores and low-power E cores) appeared, software developers had no reason to make their products aware of such an architecture. Their software could not tell low-power and high-performance cores apart, which in some cases led to reports of crashes or strange behavior (for example, in software protected by Denuvo).

Processing threads

Processing threads are streams of instructions sent to the processor. Each physical core normally processes one instruction stream at a time, so if two are sent, the second executes only after the first finishes, which can slow the processor down. To address this, manufacturers divide each physical core into two virtual cores (threads), so that each core can execute two processing threads at the same time.

Active processing versus passive processing

Active processing refers to work that requires the user's real-time input to complete an instruction; common examples include motion design, 3D modeling, video editing, and gaming. In this type of processing, single-core performance and high clock speed matter most, so fewer but more powerful cores give the smoothest experience.

Passive processing, on the other hand, covers instructions that can usually be parallelized easily and left to run on their own, such as 3D rendering and video encoding; such workloads favor processors with a large number of cores and a high base frequency, such as AMD's Threadripper series.

One of the influential factors in passive processing is a high thread count and the software's ability to use it. In simple terms, a thread is a set of data sent to the processor from an application, allowing the processor to perform several tasks efficiently and quickly at the same time; in fact, it is thanks to threads that you can listen to music while surfing the web.

Threads are not physical components of the processor but represent the amount of processing that the processor cores can do, and to execute several very intensive instructions simultaneously, you will need a processor with a large number of threads.

The number of threads in a processor is directly related to its number of cores: each core can usually host two threads, and every process is allocated at least one thread.

What is hyperthreading or SMT?

Hyperthreading in Intel processors and simultaneous multithreading (SMT) in AMD processors are concepts to show the process of dividing physical cores into virtual cores; In fact, these two features are a solution for scheduling and executing instructions that are sent to the processor without interruption.

Hyperthreading

Today, most processors are equipped with hyperthreading or SMT and run two threads per core. However, some low-end processors, such as Intel's Celeron series or AMD's Ryzen 3 series, do not support this feature and have only one thread per core. Even some high-end Intel processors ship with hyperthreading disabled, for reasons such as market segmentation, so it is generally better to check the Cores & Threads section of the spec sheet before buying any processor.

Hyperthreading or simultaneous multithreading helps to schedule instructions more efficiently and use parts of the core that are currently inactive. At best, threads provide about 50% more performance compared to physical cores.
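That best-case figure can be turned into a back-of-the-envelope model. The 50 percent gain below is the upper bound quoted above, not a guaranteed number; real SMT gains depend heavily on the workload:

```python
# Thread count and a rough best-case throughput model for SMT.
def max_threads(cores: int, smt: bool) -> int:
    """Logical threads exposed by a processor with/without SMT."""
    return cores * (2 if smt else 1)

def best_case_throughput(cores: int, smt: bool, smt_gain: float = 0.5) -> float:
    """Throughput in 'physical-core equivalents', assuming the quoted
    best-case ~50% gain per core from SMT."""
    return cores * (1 + smt_gain) if smt else float(cores)

print(max_threads(8, smt=True))            # 16 logical threads
print(best_case_throughput(8, smt=True))   # 12.0 core-equivalents
```

So an 8-core SMT chip exposes 16 threads but behaves, at best, like roughly 12 physical cores, never like 16.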

In general, if you’re only doing active processing like 3D modeling during the day, you probably won’t be using all of your CPU’s cores; Because this type of processing usually only runs on one or two cores, but for processing such as rendering that requires all the power of the processor cores and available threads, using hyperthreading or SMT can make a significant difference in performance.

CPU in gaming

Before multi-core processors, computer games were developed for single-core systems; but after AMD introduced the first dual-core processor in 2005, followed by four-, six-, and eight-core chips, games were no longer limited in how many cores they could use, since processors could now execute several different operations at the same time.

In order to have a satisfactory gaming experience, every gamer must pair the processor and the graphics processor (we will examine the graphics processor and its function in a separate article) in a balanced way. If the processor is weak or slow and cannot issue commands fast enough, the graphics unit cannot use its maximum power; of course, the opposite is also true. In such a situation, we say the slower component has become a bottleneck.

What is a bottleneck?

In computing, a bottleneck is a limit on one component's performance caused by the gap between the maximum capabilities of two hardware components. Simply put, if the graphics unit can consume instructions faster than the processor can send them, it sits idle until the next set of instructions is ready and renders fewer frames per second; in this situation, graphics performance is limited by the processor.
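In frame-rate terms, the idea reduces to taking the minimum of what each side can deliver per second. The FPS numbers below are made up for illustration:

```python
# Each displayed frame needs both CPU work (game logic, draw calls)
# and GPU work (rendering), so the slower side caps the frame rate.
def effective_fps(cpu_fps: float, gpu_fps: float) -> float:
    """Frame rate achievable when CPU and GPU run in a pipeline."""
    return min(cpu_fps, gpu_fps)

print(effective_fps(cpu_fps=70, gpu_fps=144))  # 70: the CPU is the bottleneck
print(effective_fps(cpu_fps=200, gpu_fps=90))  # 90: the GPU is the bottleneck
```

A balanced build keeps the two numbers close, so neither component's capacity is wasted.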


The same can happen in the opposite direction: if a powerful processor sends commands faster than the graphics unit can process them, the processor's capabilities are limited by the graphics unit's weaker performance.

In fact, a system built from a well-matched processor and graphics unit delivers better and smoother performance; such a system is called a balanced system. In general, a balanced system is one in which the hardware does not create bottlenecks for the user's desired workloads and provides a better user experience without disproportionate use (too much or too little) of any component.

It is better to pay attention to a few points to set up a balanced system:

  • You can't build a balanced system for an ideal gaming experience just by buying the most expensive processor and graphics card on the market.
  • A bottleneck is not necessarily caused by the quality or age of the components; it follows directly from the relative performance of the system's hardware.
  • Graphics bottlenecks are not specific to high-end systems; balance is just as important in systems with low-end hardware.
  • Bottlenecks are not exclusive to the processor and graphics, but a good match between these two components prevents the problem to a large extent.

Setting up a balanced system

In gaming and graphics processing, the effect of processor power on the quality of the experience becomes noticeable when the graphics unit is well matched to it and not already running at its limit; in addition, the type and genre of the game are important factors in choosing hardware. Quad-core processors can still run a variety of games, but hexa-core processors or better will definitely give you smoother performance. Today, multi-core processors are considered essential for any gaming system running titles such as first-person shooters (FPS) or online multiplayer games.


Galaxy S24 FE Review

The Galaxy S24 FE, Samsung's budget flagship, still carries some of its predecessors' weaknesses despite a significant increase in processing power.

Galaxy S24 FE review: insufficient steps on the path of evolution

Nowadays, it is rare to find anyone who is not somehow familiar with Samsung's Fan Edition series. FE products are very popular thanks to hardware and build quality on par with flagship phones at a lower price tag; they are often called flagship killers rather than mere fan-pleasers. The existence of strong competitors such as Xiaomi's T series is another strong reason for the FE family to live on.

Table of contents
  • Galaxy S24 FE review video
  • Galaxy S24 FE Design: Goodbye to the crowd
  • Galaxy S24 FE screen and speaker: accurate and attractive
  • S24 FE performance and charging: with the power of a full-fledged flagship
  • Galaxy S24 FE software: pseudo-flagship full of artificial intelligence
  • Galaxy S24 FE camera: more or less better and sometimes worse
  • Galaxy S24 FE camera comparison with S23 FE
  • Galaxy S24 FE; A product still involved with constant challenges

Artificial intelligence is the leading actor in the technology world these days; companies like Samsung have therefore seized the opportunity to exploit the prevailing hype, stay in the spotlight, and use AI as an important trump card against competitors.

Now it is the FE series' turn to inherit the AI capabilities of its flagship siblings along with their hardware features. But AI is not everything, and we should not forget that the FE series, for all its merits, is under the microscope, because its previous generations did not leave a bright record in chip performance stability and battery life. Has Samsung found a solution to this problem? We will answer that question in this Galaxy S24 FE review.

The Galaxy S24 FE was provided to Zoomit by Safeservice for review.

Galaxy S24 FE Design: Goodbye to the crowd

The first big change in the S24 FE's appearance is not its design language but its size: the screen has grown from 6.5 to 6.7 inches, which may be good news for some and unpleasant for others. Since most phones already come in similar dimensions, Samsung might have done better to leave the size alone, keeping the phone distinctive and preserving its appeal to fans of compact devices.

Working with Galaxy S24 FE in hand

With the increase in size, the screen-to-body ratio has reached 88 percent, which makes the previous generation's wide bezels less noticeable. I stress less noticeable, because the borders around the Galaxy S24 FE's screen are not significantly narrower than before, and you still have to get used to seeing them.

Screen protection has improved after the major setback of the S23 FE: the S24 FE uses Gorilla Glass Victus+. The glass back, however, is still Gorilla Glass 5, and its shiny, reflective surface remains a magnet for fingerprints; on the plus side, it offers good grip and does not let the phone slide out of your hand.

Galaxy S24 FE camera module design

The curved, matte aluminum frame of the previous generation has been replaced by a flat one, lest the Galaxy S24 FE defy the Koreans' design language this year. The change helps the phone sit more securely in the hand, but it reduces its elegance, and even with a 0.2-millimeter reduction in thickness, the phone looks blockier than before.

With the larger size and bigger battery, the S24 FE's weight has grown by four grams to 213 grams. The added weight is not very noticeable, but because of the larger dimensions the device no longer sits in the hand as easily as the previous generation. The combination of the two made the phone feel more like a mid-ranger than a flagship killer to me; then again, Samsung phones from low-end to flagship all look more or less the same these days.

Galaxy S24 FE selfie camera
One UI icons on Galaxy S24 FE
USB-C port on Galaxy S24 FE
Galaxy S24 FE physical keys

The Galaxy S24 FE comes in green, blue, yellow, silver, and gray (graphite). It carries an IP68 rating, surviving 30 minutes in 1.5 meters of water. The bottom edge houses the USB 3.2 Type-C port, the speaker grille, and the microphone hole, while the top edge hosts the SIM tray and the secondary noise-canceling microphone. The power and volume buttons sit on the right edge, in the same arrangement as before.

Galaxy S24 FE screen and speaker: accurate and attractive

The Galaxy S24 FE uses Samsung's Dynamic AMOLED 2X display, rendering content at FHD+ resolution (2340 x 1080 pixels) with a 120 Hz refresh rate. Because of the 0.2-inch increase in screen size, pixel density has dropped slightly, by an amount the naked eye cannot detect.
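Using the resolution and diagonals quoted above, the drop in pixel density can be checked with a quick calculation (a sketch, assuming the same 2340 x 1080 panel on both diagonals):

```python
from math import hypot

def ppi(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixels per inch: diagonal pixel count divided by diagonal inches."""
    return hypot(width_px, height_px) / diagonal_in

print(round(ppi(2340, 1080, 6.7), 1))  # S24 FE, 6.7-inch panel
print(round(ppi(2340, 1080, 6.5), 1))  # S23 FE, 6.5-inch panel
```

The difference works out to roughly a dozen PPI, well below what the eye can pick out at normal viewing distance.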

Galaxy S24 FE screen

The Galaxy S24 FE panel is not a low-power LTPO type, and its refresh rate only switches between 60 and 120 Hz depending on the content. As a result, the always-on display (AOD), unlike on the S24 series, cannot show a background image and only displays the clock and notifications.

Like other Samsung phones, the S24 FE screen uses 8-bit color depth, displaying 16 million colors; standards such as Dolby Vision are therefore unsupported, and you can only watch content in the HDR10+ standard. The good news is that, by Zoomit's measurement, brightness exceeds 2240 nits when playing such content; maximum brightness under sunlight does not pass 1695 nits, which is still 30 percent brighter than before.

Watch movies with the Galaxy S24 FE

The S24 FE display has two color profiles, Natural and Vivid, defined in the sRGB and DCI-P3 color spaces. From the S24 series onward, Samsung changed the Vivid profile so that, in its own words, it shows colors more naturally. Switching from Natural to Vivid therefore no longer brings back the old punchiness and wider DCI-P3 coverage by default, and apart from the colors turning cooler, there is no notable change.

Galaxy S24 FE display performance against the competition

| Product | Min. brightness (nits) | Max brightness, manual (nits) | Max brightness, auto (nits) | sRGB coverage / avg. error | DCI-P3 coverage / avg. error |
| --- | --- | --- | --- | --- | --- |
| Galaxy S24 FE | 1.7 | 520 | 2240 (HDR) | 99.5% (Natural) / 1.4 | 94.3% (Vivid) / 3.2 |
| Galaxy S23 FE | 1.9 | 607.5 | 1315 | 98.7% (Natural) / 1.5 | 100% (Vivid) / 4 |
| Galaxy S24 Plus | 0.8 | 587 | 2965 (HDR) | 95.6% (Natural) / 3.2 | 83.3% (Vivid) / 4.6 |
| Motorola Edge 50 Pro | 2.6 | 640 | 2300 (HDR) | 100% (Natural) / 0.8 | 100% (Vibrant) / 1.2 |
| Pixel 8 | 2 | 1228 | 2200 (HDR) | 97.9% (Natural) / 0.8 | 84.3% (Adaptive) / 2.8 |

To experience more vivid colors, you can use the slider in the Advanced Settings section to increase color intensity. Pushing Vividness to the maximum restores that old visual experience, although a small part of the DCI-P3 range remains out of the S24 FE's reach, so coverage never quite hits 100 percent.

The Natural profile displays completely neutral colors with high accuracy; the Vivid profile is also very accurate, with a DeltaE of 3.2, but it shows the same cool tendency in color reproduction.

Galaxy S24 FE volume control slider

There is an optical fingerprint sensor under the display, positioned slightly lower than on the S23 FE, and its performance is as fast and accurate as before. The speed and accuracy of an optical sensor cannot match an ultrasonic one, but you can rest assured that fingerprint recognition works without issue through the protective glass.

The Galaxy S24 FE uses the earpiece as a second speaker to produce stereo sound. The speakers deliver loud, balanced output. As with the Galaxy S23 FE, sound separation is good as long as you don't plan on playing loud music, though the bass may lack its usual thump in some songs.

S24 FE performance and charging: with the power of a full-fledged flagship

Putting Exynos chips at the heart of the Fan Edition has become a tradition, and Samsung has followed the same procedure this year. Now that, after a one-year gap, the base and Plus models of the S24 series have also launched with an Exynos chip, Samsung equips the S24 FE with the Exynos 2400e to differentiate the flagship series from the Fan Edition. In addition, unlike last year, there is no Snapdragon variant: all S24 FE models ship with the Samsung chip.

Technical specifications of the chip

| Specification | Exynos 2400 | Exynos 2400e | Exynos 2200 |
|---|---|---|---|
| CPU | 1× Cortex-X4 @ 3.21 GHz, 2× Cortex-A720 @ 2.9 GHz, 3× Cortex-A720 @ 2.6 GHz, 4× Cortex-A520 @ 1.95 GHz, 8 MB system cache | 1× Cortex-X4 @ 3.11 GHz, 2× Cortex-A720 @ 2.9 GHz, 3× Cortex-A720 @ 2.6 GHz, 4× Cortex-A520 @ 1.95 GHz, 8 MB system cache | 1× Cortex-X2 @ 2.8 GHz, 3× Cortex-A710 @ 2.52 GHz, 4× Cortex-A510 @ 1.82 GHz |
| GPU | Xclipse 940 @ 1095 MHz | Xclipse 940 @ 1095 MHz | Xclipse 920 @ 1306 MHz |
| Memory controller | 4× 16-bit channels, LPDDR5X @ 4200 MHz, 68.2 GB/s bandwidth | 4× 16-bit channels, LPDDR5X @ 4200 MHz, 68.2 GB/s bandwidth | 4× 16-bit channels, LPDDR5 @ 4200 MHz, 51.2 GB/s bandwidth |
| Video record/playback | 8K@30 / 4K@120, 10-bit H.265 | 8K@30 / 4K@120, 10-bit H.265 | 8K@30 / 4K@120, 10-bit H.265 |
| Wireless connectivity | Bluetooth 5.4, Wi-Fi 7 | Bluetooth 5.4, Wi-Fi 7 | Bluetooth 5.2, Wi-Fi 6 |
| Modem | Exynos 5300 (9640 Mbps down / 2550 Mbps up) | Exynos 5300 (9640 Mbps down / 2550 Mbps up) | Exynos 5300 (7350 Mbps down / 3670 Mbps up) |
| Manufacturing process | Samsung 4 nm | Samsung 4 nm | Samsung 4 nm |

The Exynos chip at the heart of the Galaxy S24 FE is not much different from the Exynos 2400 inside the S24-series flagships; only the frequency of its prime core is 100 MHz lower. Samsung says this frequency change in the Exynos 2400e helps improve the chip's energy consumption, stability, and temperature control across various applications.

Running Call of Duty on Galaxy S24 FE

Looking at the benchmark results in the table below, we can see that the 100 MHz reduction in processor frequency has no noticeable effect on the S24 FE's performance in the usual benchmarks compared to the S24; it falls only slightly behind in single-core and multi-core processing. And since the graphics processor is identical, the S24 FE's GPU performance is not significantly different from its flagship siblings.

Galaxy S24 FE performance against the competition

| Product | Chip | Speedometer 2.1 (web browsing) | GeekBench 6 GPU compute (Vulkan/Metal) | GeekBench 6 CPU (single-core) | GeekBench 6 CPU (multi-core) | GFXBench Aztec Ruins, Vulkan (onscreen fps) | GFXBench Aztec Ruins, Vulkan (1440p fps) |
|---|---|---|---|---|---|---|---|
| Galaxy S24 FE | Exynos 2400e | 300 | 15539 | 2077 | 6268 | 101 | 71 |
| Galaxy S23 FE | Exynos 2200 | 88 | 9874 | 1608 | 4014 | 55 | 38 |
| Pixel 8a | Tensor G3 | 135 | 6404 | 1763 | 4384 | 60 | 40 |
| Xiaomi 13T Pro | Dimensity 9200 Plus | 121 | 7197 | 1292 | 3591 | 75 | 62 |
| Nothing Phone 2 | Snapdragon 8+ Gen 1 | 155 | 6746 | 1415 | 3959 | 60 | 51 |
| Galaxy S24 | Exynos 2400 | 242 | 16233 | 2141 | 6618 | 107 | 68 |

The generational gap between the S24 FE chip and the S23 FE is so great that it makes any comparison irrelevant. The Exynos 2400e not only beats its predecessor in every area, including CPU and GPU processing power, but also beats almost every pseudo-flagship from other brands that we have reviewed at Zoomit.

Samsung's 2024 Fan Edition comes out about 30 percent ahead in single-core processing and roughly 60 percent ahead in multi-core and graphics computing power compared to the previous generation. The game-simulator test, with an improvement of about 80 percent, shows you can count on the S24 FE with more confidence for gaming.
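
The rounded uplift figures above can be reproduced directly from the benchmark table; a quick sketch in Python (scores are the ones from the table, rounding is ours):

```python
# Generational uplift of the S24 FE (Exynos 2400e) over the S23 FE
# (Exynos 2200), computed from the benchmark table above.
scores = {
    "GeekBench 6 single-core": (2077, 1608),
    "GeekBench 6 multi-core": (6268, 4014),
    "GeekBench 6 GPU compute": (15539, 9874),
    "GFXBench Aztec (onscreen fps)": (101, 55),
}
for test, (s24_fe, s23_fe) in scores.items():
    uplift = (s24_fe / s23_fe - 1) * 100
    print(f"{test}: +{uplift:.0f}%")
# single-core lands near +29%, multi-core and GPU near +56-57%,
# and the game simulator near +84% -- matching the rounded claims.
```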

Running Genshin Impact on Galaxy S24 FE

A heavy game like Genshin Impact gives a better picture of the Samsung chip's performance in demanding, more realistic scenarios. With the game's graphics set to the highest preset, it ran at an average of 55 frames per second for a short time; but after a few minutes, as the device warmed up to about 43 degrees Celsius, performance dropped to 40 frames per second. Sudden frame drops were another problem we experienced in Genshin Impact.

Zoomit used GameBench software to obtain game frame rates.

GPU performance-stability testing gives a better view of how the chip manages performance and temperature during heavy, prolonged use. The Galaxy S24 FE starts the test nearly 1,000 points behind its S24 sibling but ends it with an equal score, which shows that the 2400e's performance drop under heavy load is smaller than the Exynos 2400's and points to better performance management.

In terms of temperature control, the situation is not very favorable for the Samsung chip, and given that we do not have fond memories of the thermal behavior of Exynos chips, we watch this category with extra sensitivity. During the stress test, the temperature of the phone's body rose to 46 degrees Celsius, which cannot be described with the words "progress" or "improvement".

Galaxy S24 FE battery settings

The Galaxy S24 FE battery is 200 mAh larger than last year's model, and considering that the screen has also grown this year, the result of the battery life test looks promising: the Galaxy S24 FE lasts about 13 hours in daily use, four hours more than the previous generation.

Galaxy S24 FE battery life versus the competition

| Product | Display (size, refresh rate, resolution) | Battery (mAh) | Video playback (h:mm) | Everyday use (h:mm) |
|---|---|---|---|---|
| Galaxy S24 FE | 6.7 inches, 120 Hz, 2412 × 1080 pixels | 4700 | – | 13:04 |
| Galaxy S23 FE | 6.4 inches, 120 Hz, 2340 × 1080 pixels | 4500 | – | 9:11 |
| Pixel 8a | 6.1 inches, 120 Hz, 2400 × 1080 pixels | 4492 | 20:00 | 12:00 |
| Xiaomi 13T Pro | 6.67 inches, 120 Hz, 2712 × 1220 pixels | 5000 | 16:39 | 10:49 |
| Nothing Phone 2 | 6.7 inches, 120 Hz, 2412 × 1080 pixels | 4700 | 25:50 | 14:58 |
| Galaxy S24 | 6.2 inches, 120 Hz, 2340 × 1080 pixels | 4000 | 27:55 | 12:51 |

This year, the Galaxy S24 FE enters the market with only 8 GB of RAM, which, amid the current artificial-intelligence fever, is not a generous capacity for a pseudo-flagship Android device. Alongside the 128 and 256 GB storage options, a 512 GB version is also available; the 128 GB model uses UFS 3.1 storage.

Galaxy S24 FE storage speed compared to competitors

| Phone model | Sequential read rate | Sequential write rate |
|---|---|---|
| Galaxy S24 FE | 2426 MB/s | 1514 MB/s |
| Galaxy S24 Ultra | 2473 MB/s | 1471 MB/s |

We had the 256 GB model for review. Based on the benchmark results, the storage in the 256 and 512 GB models appears to be UFS 4.0, which delivers high read and write speeds.

Galaxy S24 FE software: pseudo-flagship full of artificial intelligence

In addition to the flagship chip, the Galaxy S24 FE inherits many software features from the Galaxy S24 family. Since we have been hearing about Samsung's AI constantly since the S24 series was announced, we expected to see at least some of Samsung's native AI capabilities in this year's Fan Edition. Fortunately, in the software department, the Koreans have been especially generous to this year's model.

Circle to Search feature on Galaxy S24 FE

Galaxy S24 FE offers a complete package of Samsung’s artificial intelligence features in its software. Functions such as Sketch To Image, live translation of calls or Live Translate, photo editing with artificial intelligence, artificial intelligence assistant functions such as Chat Assist and Note Assist, and of course, the attractive and extremely practical Circle To Search function are also present in S24 FE.

Thanks to the powerful chip, the Galaxy S24 FE can process some AI language models without an Internet connection and also has access to the offline version of Google's AI, Gemini Nano. Since we have already covered Samsung's AI capabilities in detail in our Galaxy S24 Ultra review, we suggest you check out that article.

Super HDR feature on Galaxy S24 FE

The Galaxy S24 FE runs Android 14 out of the box, and even setting the AI capabilities aside, there are some interesting features at its heart. For example, the S24 FE now supports Super HDR: the display's brightness is adjusted per photo based on its bright and dark areas, independently of the device's overall brightness, so that, as with HDR videos, photos look more vivid and brighter than before. Along with the promise of seven years of operating-system updates, this is another feature the S24 FE shares with this year's Samsung flagships, and one that can encourage any user to buy it.

Galaxy S24 FE camera: more or less better and sometimes worse

For the S24 FE camera, the Koreans have stuck to their usual triple combination of wide, ultrawide, and telephoto, and apart from the telephoto camera, have not touched the sensors or lenses. As before, the same 50-megapixel sensor used in the previous generation as well as the S22 through S24 phones, and the 12-megapixel sensor of the S23 FE's ultrawide camera, are present in this phone.

Photography with Galaxy S24 FE

The telephoto camera uses an 8-megapixel sensor made by OmniVision instead of SK Hynix, which is not particularly different in sensor size or pixel pitch and offers the same 3x magnification; so apart from the image signal processor's role in processing photos, no other factor should account for differences in image output.

| Camera | Sensor | Lens | Color filter | Capabilities |
|---|---|---|---|---|
| Wide (main) | 50 MP, 1/1.57 inch, 1.0 µm pixels, Dual Pixel phase-detection autofocus | 23 mm, f/1.8, optical stabilization | Tetrapixel | Video: 8K@30fps, 4K@30/60/120fps, 1080p@30/60/120/240fps, HDR10+ |
| Ultrawide | 12 MP, 1/3 inch, 1.12 µm pixels | 13 mm, f/2.2, no stabilization | RGB Bayer | – |
| Telephoto | 8 MP, 1/4.4 inch, 1.0 µm pixels, phase-detection autofocus | 76 mm (3x magnification), f/2.4, optical stabilization | RGB Bayer | – |
| Selfie | 10 MP, 1/3 inch, 1.22 µm pixels | 25 mm, f/2.4, gyro-based electronic stabilization | RGB Bayer | Video: 4K@30/60fps, 1080p@30/60fps |

Samsung says that the S24 FE uses the ProVisual Engine to improve the quality of photos and remove noise and blurring caused by vibration, especially in the dark, and also prevents the loss of image quality in digital zooms. Also, thanks to the more powerful chip, it is now possible to record 8K videos at 30 fps and even 4K slow-motion video at 120 fps.

Galaxy S24 FE photo gallery in daylight

Orange motorcycle in the alley
Black platform in the park
Children's play equipment in the park
Green bench in the park
Cat sitting on the edge of the pond
Red flower close up
Some big and small pots
A close up of a pink flower

Overall, the photos from the S24 FE's cameras are not much different from the previous generation's; in some scenarios this phone performs better, and in others the opposite is true.

Read more: Samsung Galaxy Z Flip 6 review

Galaxy S24 FE camera comparison with S23 FE

In this photo taken with the main camera, the Galaxy S23 FE is better in terms of dynamic range; the S24 FE failed to strike a balance between shadows and highlights, so details in the bright areas are blown out. The yellow cast of S24 FE photos compared to the previous generation is also clearly visible.

The highway is full of cars
Wide picture S23 FE
A view of the busy highway
Wide picture S24 FE
A building with a stone facade
S23 FE wide photo crop
A building with a white facade
S24 FE wide photo crop

In the examples below, the S24 FE's ultrawide camera renders better color and more contrast, and it also beats the S23 FE in detail and dynamic range. If you look at the corners of the image, the distortion caused by the camera's wide field of view is also lower.

City highway view
S23 FE ultrawide photo
Highway view from the flyover
S24 FE ultrawide photo
Brick residential building
Ultrawide photo crop S23 FE
A building with a brick facade
S24 FE ultrawide photo cropping

The telephoto cameras of the two phones perform very similarly; in some conditions the Galaxy S24 FE can capture more colorful, punchy, high-contrast photos, but it still shows weakness in dynamic range compared to the Galaxy S23 FE and cannot recover details from bright areas.

A view of trees and buildings
S23 FE telephoto photo
Urban buildings
S24 FE telephoto photo
A crowded roof of a building
S23 FE telephoto crop
Building roof with roof garden
S24 FE telephoto crop

To compare detail, I shot with both phones in 50-megapixel mode, and as you can see, the S24 FE extracted more detail from the scene; but this comes at the cost of more noise and over-sharpening, which pushes the images away from a natural look.

Kerami building facade
50 megapixel S23 FE photo
Stone facade of a building
50 megapixel S24 FE photo
Creamy stone texture
S23 FE 50-megapixel photo crop
Stone texture
S24 FE 50-megapixel photo crop

In the portrait shots, although the S24 FE photo may look more pleasing, the Galaxy S23 FE clearly renders the shadows and highlights better and does not manipulate the light reflections, capturing a skin tone closer to reality and, as a result, a more natural photo.

A person in a brown shirt
Portrait photo S23 FE
A boy with glasses
Portrait photo S24 FE

In the dark, since the Galaxy S24 FE can shoot more steadily, it slows the shutter speed instead of raising the sensitivity in order to capture less noisy photos. For this reason, details in shadows and darker areas may not be as discernible as on the S23 FE; in return, less processing is applied, so the photos look more realistic.

A crowded table in a dark room
Wide picture S23 FE
Small vase on the table
S23 FE wide photo crop
Vase on the book on the table
S23 FE telephoto photo
White table in purple room
Wide picture S24 FE
Vase and book on the table
S24 FE wide photo crop
Vase on the book
S24 FE telephoto photo

In terms of detail, the S24 FE excels with all of its cameras. If you look at the ultrawide photo, the subject's texture is destroyed by software processing on the older Fan Edition, while the details are clearly visible on the new generation.

Galaxy S24 FE photo gallery at night

Building lighting at night
A large white building
A view of the highway at night
A view of the lake and buildings at night
A corridor with a few trees at night
Windshield of a building
A building with a modern facade at night
A view of an amusement park at night

The S24 FE's selfie camera is the same 10-megapixel unit as before, with no changes; yet, surprisingly, the previous generation captures more vivid, higher-contrast photos. In terms of detail, there is no particular difference between them.

A standing person in a brown t-shirt
S23 FE selfie
A man with glasses and a brown T-shirt
S24 FE selfie

A product still wrestling with the same old challenges

The one-year absence of Exynos gave the Korean Fan Edition a second chance at a significant hardware upgrade, so upgrading from the S23 FE to this phone makes more sense than upgrading from the S21 FE to the S23 FE did. Longer battery life and seven years of software support, along with the artificial-intelligence capabilities, are the other important factors that bring the previous FE generations to their knees against this phone.

Returning to the question at the beginning of the article, the performance-throttling and heat problems that have always plagued the FE series have not been solved this year; even the detuned version of the Exynos 2400 could not help. It seems this issue will remain the killer of the Korean flagships until Samsung gives up on Exynos.

What is the role of the graphics processing unit (GPU) in computers and smart devices? What are its components and how is the image transferred to the screen?

What is a graphics processor? Everything you need to know about GPU

The graphics processing unit (GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory in order to accelerate the creation of images for output to a display. The graphics processor is built around a number of basic graphics operations, which in their most basic form draw rectangles, triangles, circles, and arcs, and it is much faster than a CPU at creating images.

Table of contents
  • What is a graphics processor (GPU)?
  • 3D image
  • Bitmapped graphics (.BMP)
  • Vector graphics
  • Rendering
  • Graphics API
  • What is GDDR?
  • History of 3D graphics
  • How 3D graphics are produced
  • 3D modeling
  • Layout and animation
  • Rendering
  • Shading techniques
  • Pixel shaders
  • Vertex shaders
  • Difference between GPU and CPU
  • Familiarity with GPU architecture
  • Tensor cores
  • Ray tracing engine
  • What is GPGPU?
  • What is CUDA?
  • Advantages of CUDA cores
  • Disadvantages of CUDA cores
  • OpenCL: the CUDA alternative
  • CUDA and OpenCL vs. OpenGL
  • OpenCL or CUDA
  • The most prominent brands
  • Intel
  • Nvidia
  • AMD
  • The difference between graphics processor and graphics card
  • Graphics card components
  • Video memory
  • Printed circuit board
  • Display connectors
  • Bridge
  • Graphics interface
  • Voltage regulator circuit
  • Cooling system
  • Types of graphics processors
  • iGPU
  • dGPU
  • Cloud GPU
  • eGPU
  • Mobile GPU
  • Types of mobile GPUs
  • Other applications of graphics processors
  • Video editing
  • 3D graphics rendering
  • Machine learning
  • Blockchain and cryptocurrency mining

In fact, what we see on the screens is the output of the graphics processors that are used in many systems such as phones, computers, workstations, and game consoles.

What is a graphics processor (GPU)?

If we think of the central processing unit (CPU) as the brain of the computer, managing all calculations and logical instructions, then the graphics processing unit (GPU) can be seen as the unit that manages the visual and graphical side: the output of calculations, instructions, and image-related data. The parallel structure of GPUs makes them more efficient than CPUs at processing large blocks of data. In effect, the GPU is the graphics intermediary that converts the calculations made by the processor into a form the user can see, and it is safe to say that any device that displays graphical output of some kind is equipped with some kind of graphics processor.

The graphics processing unit in a computer can sit on a graphics card, on the motherboard, or alongside the processor on an integrated chip (for example, AMD's APUs). The linked article explains the fastest way to identify your graphics card model in Windows.

Integrated chips cannot produce particularly impressive graphics output, and their results will certainly not satisfy a gamer. To enjoy higher-quality visual effects, you need a discrete graphics card (we will learn more about the differences between a graphics processor and a graphics card later) with capabilities beyond a simple graphics processor. In the following, we briefly cover some basic concepts used in graphics.

3D image

An image that has depth in addition to length and width is called a three-dimensional image; it conveys more to the viewer and carries more information than a two-dimensional image. For example, a triangle is just three lines and three angles, but a pyramid-shaped object such as a tetrahedron is a three-dimensional structure made up of four triangular faces, six edges, and four vertices.

Bitmapped graphics (.BMP)

A bitmapped, or rasterized, graphic is a digital image in which each pixel is represented by a number of bits. The image is built by dividing it into small squares, or pixels, each of which carries information such as color and transparency; so in raster graphics, each pixel corresponds to a computed, predetermined value that can be specified with great precision.

Bitmapped graphics

Raster graphics are resolution-dependent, which means images produced this way cannot be scaled up without losing visual quality.

Vector graphics

A vector graphic (.ai, .eps, .pdf, or .svg formats) is an image built from paths with start and end points. These paths are all based on mathematical expressions and consist of basic geometric shapes such as lines, polygons, and curves. The main advantage of vector over bitmapped graphics is the ability to scale without losing quality: images produced as vector graphics can be enlarged freely, limited only by the capability of the device rendering them.

As mentioned, unlike vector graphics, which scale to any size with the help of mathematical formulas, bitmapped graphics lose quality when scaled. The pixels of a bitmapped graphic must be interpolated when upscaling, which blurs the image, and resampled when downscaling, which discards image data.
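
The contrast can be sketched in a few lines of Python. The shapes and sizes below are purely illustrative toy data, not tied to any real file format:

```python
# A minimal sketch of why vector art scales cleanly and raster art does not.

# Vector: a circle stored as a mathematical description; scaling just
# multiplies its parameters, so the shape stays exact at any size.
circle = {"cx": 10.0, "cy": 10.0, "r": 5.0}
scaled = {k: v * 4 for k, v in circle.items()}  # 4x zoom, still perfectly smooth

# Raster: a tiny 2x2 bitmap; scaling must invent pixels. Nearest-neighbour
# upscaling simply repeats each pixel, producing visible blocks.
bitmap = [[0, 1],
          [1, 0]]
factor = 2
upscaled = [[bitmap[y // factor][x // factor]
             for x in range(len(bitmap[0]) * factor)]
            for y in range(len(bitmap) * factor)]
for row in upscaled:
    print(row)  # each original pixel becomes a 2x2 block
```

The vector circle remains a perfect circle at any zoom, while the upscaled bitmap is just the same four pixels repeated as blocks, which is exactly the blockiness you see when enlarging a raster image.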

Vector graphics vs bitmapped graphics

In general, vector graphics are best for artwork built from geometric shapes, such as logos, digital maps, typefaces, or graphic designs, while raster graphics are better suited to real-world photographic images.

Vector graphics are a good choice for banners or logos, because the images display at the same quality at both small and large sizes. One of the most popular programs for viewing and creating vector images is Adobe Illustrator.

Rendering

Rendering is the process of producing a 3D image in software from computational models and displaying it as output on a 2D screen.

Graphics API

An application programming interface (API) is a protocol for communication between different parts of computer programs, and it is an important tool for software to interact with graphics hardware; the protocol may be based on the web, an operating system, a data center, hardware, or software libraries. Many tools and programs have been developed for imaging and rendering 3D models, and one important use of graphics APIs is to make that process easier for developers: graphics APIs give developers virtual access to the underlying platforms for building and testing their graphics applications. Below we introduce some of the best-known graphics APIs.

OpenGL

OpenGL (short for Open Graphics Library) is a cross-platform standard and application programming interface for 2D and 3D graphics and rendering, and serves as a graphics accelerator in video games, design, virtual reality, and other applications. The library offers more than 250 different calling functions for drawing 3D images and has shipped in two flavors: Microsoft's implementation (often bundled with Windows or graphics-card driver software) and Cosmo OpenGL (a software implementation for systems without a graphics accelerator).

opengl

The OpenGL graphic interface was first designed by Silicon Graphics in 1991 and was released in 1992; The latest version of this API, OpenGL 4.6, was also introduced in July 2017.

DirectX

DirectX is a set of application programming interfaces developed by Microsoft to let instructions communicate with audio and video hardware. Games built on DirectX can use multimedia features and graphics accelerators more efficiently and show better overall performance.

When Microsoft was preparing to release Windows 95 in late 1994, Alex St. John, a Microsoft employee, looked into the state of game development for MS-DOS. The programmers of those games generally rejected the idea of porting them to Windows 95 and considered developing games for the Windows environment difficult. So a three-person team was formed, and within four months it developed the first set of application programming interfaces, called DirectX, to solve this problem.

The first version of DirectX was released in September 1995 as the Windows Games SDK, replacing the Win32 DCI and WinG APIs of Windows 3.1. DirectX allowed Windows 95 and all subsequent versions of Microsoft Windows to host high-performance multimedia content.

To boost DirectX's acceptance among developers, Microsoft offered John Carmack, developer of Doom and Doom 2, a free port of those two games from MS-DOS to Windows 95 using DirectX. Carmack agreed, and Doom 95 was released in August 1996 as the first game built on DirectX. DirectX 2.0 became part of Windows itself with the release of the next revision of Windows 95 and of Windows NT 4.0 in mid-1996.

Since Windows 95 was still in its infancy at the time and few games had been published for it, Microsoft began promoting the interface heavily, and at one event Direct3D and DirectPlay were shown off for the first time in an online demo of the multiplayer game MechWarrior 2. The DirectX development team faced the challenge of testing each release of the interface against every combination of computer hardware and software: each beta and final version was tested against different graphics cards, sound cards, motherboards, processors, input devices, games, and other multimedia applications, and test suites were even produced and distributed so the hardware industry could check the compatibility of new designs and driver releases with DirectX.

The latest version of DirectX, DirectX 12, was unveiled in 2014 and officially released a year later alongside Windows 10. This graphics API supports explicit multi-adapter, allowing several GPUs to be used simultaneously in one system.

Before DirectX, Microsoft had included OpenGL in its Windows NT platform, and Direct3D was intended as a Microsoft-controlled alternative to OpenGL, initially focused on gaming. OpenGL kept evolving during this period and came to support programming techniques for interactive multimedia applications such as games quite well, but with Microsoft's weight behind the DirectX team, OpenGL gradually fell out of the race.

Vulkan

Vulkan is a low-overhead, cross-platform graphics API for graphics applications such as games and content creation. What distinguishes it from DirectX and OpenGL is its lower overhead, its ability to also render 2D graphics, and its lower power consumption.

At first, many thought that Vulkan could be the next improved OpenGL and the continuation of its path, but the passage of time has shown that this prediction was not correct. The following table shows the performance differences of these two graphics APIs.

| OpenGL | Vulkan |
|---|---|
| Has a single global state machine | Object-based, with no global state |
| State is tied to a single context | All state concepts live in command buffers |
| Operations are executed only sequentially | Supports multi-threaded programming |
| Memory and GPU synchronization are usually hidden | Synchronization and memory can be controlled and managed explicitly |
| Error checking is performed continuously | Drivers do no error checking at runtime; instead, a validation layer is available to developers |

Mantle

The Mantle graphics API is a low-overhead interface for rendering 3D video games, first developed by AMD together with video game developer DICE in 2013. The partnership was meant to compete with Direct3D and OpenGL on home computers; however, Mantle was officially discontinued in 2019 and superseded by the Vulkan graphics API. Mantle could effectively reduce the processor's workload and eliminate bottlenecks in the processing pipeline.

Metal

Metal is Apple's proprietary graphics API, with a shading language based on C++, first used in iOS 8. Metal can be seen as a combination of the OpenGL graphics interface and the OpenCL framework, playing the role that APIs such as Vulkan and DirectX 12 play on other platforms, but for iOS, macOS, and tvOS. In 2017, the second version of the Metal graphics API was released with support for macOS High Sierra, iOS 11, and tvOS 11; compared to its predecessor, it was more efficient and better optimized.

What is GDDR?

The DDR memory used by a graphics processing unit is called GDDR, or GPU RAM. DDR (short for Double Data Rate) is an advanced form of synchronous dynamic RAM (SDRAM) and runs at the same clock frequencies. The difference is how many times data is transferred per clock cycle: DDR transfers data twice per cycle, doubling the effective memory speed, while SDRAM transfers it only once. DDR quickly became popular because, in addition to double the transfer rate, it was cheaper than SDRAM and consumed less power than older SDRAM modules.

Related articles:
  • What is DDR5? Everything you need to know about the latest RAM standard [with video]

GDDR was introduced in 2006 for fast rendering on the graphics processor. Compared with ordinary DDR, this memory runs at a higher frequency and produces less heat, and it replaced VRAM and WRAM. Six generations have been released so far, each faster and more advanced than the last.

GDDR5 is the previous generation of video RAM, and ten years have passed since the introduction of the current GDDR standard, GDDR6. GDDR6, with a per-pin transfer rate of 16 Gbps (double that of GDDR5) and 32-byte read/write access (equal to GDDR5), is used in Nvidia's RTX 30 series and AMD's latest 6000-series graphics cards. GDDR version numbers do not correspond one-to-one with DDR: GDDR5, like GDDR3 and GDDR4, is based on DDR3 technology, while GDDR6 is based on DDR4; in practice, GDDR has taken a fairly independent path from DDR in terms of performance.

The main task of the graphics processor is to render images. To do this, it needs space to store the information required to build the completed image, so the graphics unit uses RAM (random-access memory) to store data: information about each pixel of the image, including its color and its location on the screen. A pixel is a physical point in a raster image, one element of the rectangular grid (dot matrix) that makes up the picture. RAM also holds completed images until it is time to display them; this storage is called the frame buffer.
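The frame buffer’s memory cost follows directly from this per-pixel storage. A quick sketch, assuming a common resolution and 4 bytes (32-bit color) per pixel:

```python
# A frame buffer holds one completed image: one entry per pixel,
# storing that pixel's color. At 4 bytes per pixel (32-bit color),
# the cost is simply width x height x 4.
def framebuffer_bytes(width, height, bytes_per_pixel=4):
    return width * height * bytes_per_pixel

size = framebuffer_bytes(1920, 1080)   # one 1080p frame
print(size)                            # 8294400 bytes
print(size / (1024 * 1024))            # roughly 7.9 MiB per frame
```

Multiply by the number of frames kept in flight (double or triple buffering) and the appetite of modern resolutions for video memory becomes clear.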

Before the development of graphics processors, the CPU was responsible for processing images and rendering the output; this put a lot of pressure on the processor and slowed down the system. In fact, the sparks of today’s 3D graphics were lit by the development of arcade games, gaming consoles, military, robotics, and space simulators, as well as medical imaging.

In the following, before examining the history of the graphics processing unit, we introduce the concepts that matter in this industry:

History of 3D graphics

The term GPU was first used in the 1970s as an abbreviation for Graphics Processor Unit, describing a programmable processing unit that functioned independently of the central processing unit and was responsible for graphics manipulation and output. Of course, at that time the term was not defined as it is today.

In 1981, IBM released its first two graphics cards, the MDA (Monochrome Display Adapter) and the CGA (Color Graphics Adapter). The MDA had four kilobytes of video memory and supported only text display; it is no longer used today but may still be found in some older systems.

IBM MDA graphics
IBM CGA graphics

The CGA is considered the first color graphics card for personal computers; it was equipped with sixteen kilobytes of video memory and could produce 16 colors at a resolution of 160 x 200 pixels. A year later, Hercules responded to IBM’s graphics cards with the HGC (Hercules Graphics Card), which had 64 kilobytes of video memory and combined the MDA’s text mode with bitmapped graphics.

In 1983, Intel entered the graphics card market with the iSBX 275 Video Graphics Multimodule, which could display eight colors at a resolution of 256 x 256. A year later, IBM introduced the PGC (Professional Graphics Controller) and the EGA (Enhanced Graphics Adapter), which displayed 16 colors at a resolution of 640 x 350 pixels.

The VGA (Video Graphics Array) standard was introduced in 1987; it offered a resolution of 640 x 480 with 16 colors and up to 256 kilobytes of video memory. In the same year, ATI introduced its first VGA graphics card, the ATI VGA Wonder; some models were even equipped with a port for connecting a mouse. Until then, video cards had little memory: processors handed graphics work to this video memory and, after performing calculations and signal conversion, displayed the result on the output device.

ATI VGA WONDER

After the first 3D video games were released, it was no longer possible to process graphics quickly enough on CPUs; in this situation, the basic concept of a graphics processing unit took shape. It began with the graphics accelerator, which boosted system performance, performed graphics calculations, and lightened the processor’s workload, with a significant impact on computer performance, especially in graphics-intensive work. In 1992, Silicon Graphics released OpenGL, the first library of functions for drawing 3D images.

The GPU evolved from the beginning as a complement to the CPU and to lighten the workload of the unit

Four years later, a company called 3dfx introduced its first graphics card, the Voodoo1. It required a separate 2D graphics card alongside it to render 3D graphics, and it quickly became popular among gamers.

In 1997, in response to Voodoo, Nvidia released the RIVA 128 graphics accelerator. Like the Voodoo1, the RIVA 128 let video card manufacturers pair a graphics accelerator with 2D graphics, but its 3D rendering was weaker than the Voodoo1’s.

Voodoo1
Riva 128

After the RIVA 128, 3dfx released the Voodoo2 as a replacement for the Voodoo1. It was the first graphics card to support SLI, allowing two or more cards to be connected to produce a single output. 3dfx’s SLI stood for Scan-Line Interleave; Nvidia later revived the acronym as Scalable Link Interface, the brand name for its own now-obsolete technology for parallel processing and increased graphics power.

The term GPU was popularized in 1999 with the worldwide launch of Nvidia’s GeForce 256, billed as the world’s first graphics processor. Nvidia marketed it as a single-chip processor that integrated transform (converting a 3D scene to a 2D view), lighting and surface-color changes, and rendering of image regions. To compete with Nvidia, ATI Technologies released the Radeon 9700 in 2002 under the term Visual Processing Unit (VPU).

With the passage of time and the advancement of technology, GPUs gained programmable capabilities, which pushed Nvidia and ATI into direct competition with their first graphics processor lines (GeForce for Nvidia and Radeon for ATI).

GeForce 256

Nvidia officially entered the graphics card market in 1999 with the GeForce 256. This card is known as the world’s first true GPU; it had 32 MB of DDR (i.e., GDDR) memory and fully supported DirectX 7.

Alongside the efforts to speed up computer graphics processing and improve its quality, video game and console makers also pushed the field forward in their own ways (Sega with the Dreamcast, Sony with the PS1, and Nintendo with the Nintendo 64).

How to produce 3D graphics

The process of producing 3D graphics is divided into three main stages:

3D modeling

3D modeling is the process of developing a mathematical, coordinate-based representation of a physical surface or object (inanimate or animate) in three dimensions. It is done through specialized software by manipulating edges, vertices, and polygons in a simulated 3D space.

Physical objects are represented by a set of points in three-dimensional space, connected by geometric elements such as triangles, lines, and curved surfaces. Basically, 3D models are created by connecting points to form polygons. A polygon is an area defined by at least three vertices (a triangle), and the overall integrity of the model and its suitability for animation depend on the structure of these polygons.
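A mesh built this way can be sketched as plain data. This is a hypothetical minimal example (not any particular file format): four vertices and four triangular faces forming a tetrahedron.

```python
# A minimal polygon mesh: vertices are points in 3D space, and faces
# are triangles given as triples of vertex indices.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]
faces = [  # each face references three vertices by index
    (0, 1, 2),
    (0, 1, 3),
    (0, 2, 3),
    (1, 2, 3),
]
# every face is a polygon of at least three vertices (here, triangles)
assert all(len(f) == 3 for f in faces)
```

Storing faces as indices rather than repeated coordinates is what lets neighboring polygons share vertices, which keeps the mesh watertight when it deforms during animation.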

Three-dimensional models are made in two ways: polygon (vertex) modeling, which connects grid lines of vectors, and curve modeling, which weights each control point. Today, because of its greater flexibility and faster rendering, the vast majority of 3D models are produced with the polygonal, textured approach. One of the main tasks of graphics cards is texture mapping, which adds a texture to an image or 3D model; for example, adding a stone texture to a model makes it look like real stone, or adding a face-like texture gives a scanned 3D model a face.

Texture mapping

In the second method, the model is obtained by weighted control of curve points. The points are not interpolated directly; curved surfaces are created by relatively increasing the polygon count, and increasing the weight of a point pulls the curve closer to that point.

Layout and animation

After modeling, and before rendering the objects and creating the image, it must be determined how objects (models, lights, etc.) are placed and moved in the scene; the objects must be designed and arranged before the images are rendered. Defining the location and size of each object establishes the spatial relationships between objects. Motion, or animation, refers to the temporal description of an object (how it moves and changes shape over time). Common layout and animation methods include keyframing, inverse kinematics, and motion capture; these techniques are often used in combination.

Rendering

In the last stage, computer calculations based on light placement, surface types, and other specified factors are performed to produce the final image. Here, materials and textures are the data used for rendering.

The amount of light transmitted from one surface to another, and how it is distributed and interacts across surfaces, are the two basic operations in rendering, usually implemented by 3D graphics software. In fact, rendering is the final process of creating a 2D image or animation from a 3D model and a prepared scene, using several different and often specialized methods that may take anywhere from a fraction of a second to several days for a single image/frame.

Shading technique

After graphics processors were developed to reduce the CPU’s workload and to enable images of far more impressive quality than before, Nvidia and ATI gradually became the main players in the world of computer graphics. The two competitors worked hard to outdo each other, each trying to add more stages to modeling and rendering and to improve its techniques. The shading technique can be seen as the birthplace of their competition.

In the computer graphics industry, shading refers to the process of changing the color of an object/surface/polygon in a 3D scene based on things like its distance from the light, its angle to the light, or the surface’s angle to the light.
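This angle-based color change can be sketched in a few lines. The following is a minimal Lambertian-style sketch (illustrative only): brightness is the dot product of the surface normal and the light direction, clamped at zero so surfaces facing away from the light stay dark.

```python
import math

# Diffuse shading in miniature: a surface's brightness is proportional
# to the cosine of the angle between its normal N and the direction L
# toward the light. With unit vectors, that cosine is the dot product.
def shade(normal, light_dir, base_color):
    dot = sum(n * l for n, l in zip(normal, light_dir))
    intensity = max(0.0, dot)              # facing away -> no light
    return tuple(c * intensity for c in base_color)

up = (0.0, 1.0, 0.0)                       # surface normal pointing up
print(shade(up, (0.0, 1.0, 0.0), (200, 200, 200)))   # light overhead: full brightness
# light tilted 60 degrees from the normal: cos(60) = 0.5, half brightness
tilted = (math.sin(math.radians(60)), math.cos(math.radians(60)), 0.0)
print(shade(up, tilted, (200, 200, 200)))
```

Real shaders add specular terms, shadows, and textures on top, but this cosine rule is the core of what "changing color based on the angle to the light" means.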

Shaders calculate the appropriate levels of light, dark, and color while rendering a 3D scene.

Shading during the rendering process is done by a program called Shader, which calculates the appropriate levels of light, dark, and color during the rendering of a 3D scene. In fact, shaders have evolved to perform a variety of specialized functions in graphic effects, video post-processing, as well as general-purpose computing on GPUs.

Shader changes the color of surfaces in a 3D model based on the angle of the surface to the light source or light sources.

  • In the first image below, all the surfaces of the box are rendered in one color and only the edge lines are marked, to make the shape easier to see.
  • The second image shows the same model without the edge lines; in this case it is somewhat difficult to tell where one face of the box ends and the next begins.
  • In the third image, the shading technique has been applied; the final image looks more realistic and the surfaces are easier to distinguish.
The first step of shading

Shaders are widely used in film post-production, computer graphics, and video games to produce a wide range of effects. Shaders are simple programs that describe a vertex or a pixel: vertex shaders describe properties such as position, texture coordinates, and color for each vertex, while pixel shaders describe the color, z-depth, and alpha value of each pixel.

There are three types of shaders in common use (pixel, vertex, and geometry shaders). Older graphics cards used separate processing units for each shader type, but newer cards are equipped with unified shaders that can run any of them, allowing more optimized processing.

Pixel shaders

Pixel shaders calculate and render the color and other properties of each pixel region. The simplest pixel shaders output only one color per screen pixel. Beyond simple lighting models, pixel shaders can produce more complex effects: color-space changes, saturation, brightness (HSL/HSV) and contrast adjustments, blur, light bloom, volumetric lighting, normal mapping (for a depth effect), bokeh, cel shading, posterization, bump mapping, distortion, blue-screen or green-screen effects, edge and motion highlighting, and simulated psychedelic effects.

Of course, in 3D graphics a pixel shader alone cannot create complex effects, because it works on a single region and has no access to vertex information. But if the contents of the entire screen are passed to the shader as a texture, the shader can sample the surrounding pixels, enabling a wide range of 2D post-processing effects such as blur or edge detection and enhancement.
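The pixel-shader model can be sketched as one small function applied independently to every pixel (a toy brightness filter; on a GPU, all of these per-pixel calls would run in parallel):

```python
# A pixel-shader-style operation in miniature: the same function runs
# on every pixel independently, here raising brightness and clamping
# each channel to the valid 0..255 range.
def brighten(pixel, factor=1.5):
    r, g, b = pixel
    return tuple(min(255, int(c * factor)) for c in (r, g, b))

# a tiny 2x2 "image" of (R, G, B) pixels
image = [[(10, 20, 30), (100, 100, 100)],
         [(200, 200, 200), (255, 0, 0)]]

output = [[brighten(p) for p in row] for row in image]
print(output[0][0])   # (15, 30, 45)
print(output[1][0])   # (255, 255, 255): clamped at the maximum
```

Because `brighten` never looks at neighboring pixels, every pixel can be computed at the same time; that independence is exactly what makes the work embarrassingly parallel on GPU hardware.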

Vertex shaders

Vertex shaders are the most common type of 3D shader and run once on each vertex given to the GPU. Their purpose is to convert each vertex’s three-dimensional position in virtual space into two-dimensional coordinates for display on the monitor. Vertex shaders can manipulate properties such as position, color, and texture coordinates, but cannot create new vertices.
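That 3D-to-2D conversion can be sketched as follows. This is a simplified stand-in for what a vertex shader computes: a perspective divide (x/z, y/z) followed by a viewport transform; a real shader would multiply by full model-view-projection matrices instead.

```python
# What a vertex shader does in miniature: take a vertex's 3D position
# and produce 2D screen coordinates.
def project(vertex, screen_w=640, screen_h=480, focal=1.0):
    x, y, z = vertex
    ndc_x = focal * x / z                  # perspective divide:
    ndc_y = focal * y / z                  # farther points shrink toward center
    sx = (ndc_x + 1) * 0.5 * screen_w      # map [-1, 1] to pixel columns
    sy = (1 - ndc_y) * 0.5 * screen_h      # screen y grows downward
    return sx, sy

print(project((0.0, 0.0, 2.0)))   # a point straight ahead lands at screen center
```

Dividing by `z` is what makes distant vertices crowd toward the middle of the screen, producing the familiar perspective foreshortening.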

Shaders needed parallelism to calculate and render quickly; the concept of the thread was born here

In 2000, ATI introduced the Radeon R100 series of graphics cards, launching the lasting legacy of the Radeon line. The first Radeon cards were fully DirectX 7 compatible and used ATI’s HyperZ technology, which combines three techniques (Z compression, fast Z clear, and hierarchical Z buffering) to conserve bandwidth and improve rendering efficiency.

In 2001, Nvidia released the GeForce 3 series, the first graphics cards in the world with programmable pixel shaders. Five years later, ATI was bought by AMD, and since then Radeon graphics cards have been sold under the AMD brand. Shader programs needed parallelism for fast calculation and rendering; to provide it, Nvidia proposed the concept of the thread for graphics processors, which we explain further below.

Difference between GPU and CPU

The GPU evolved from the beginning as a complement to the CPU, to lighten that unit’s workload. Today, processors keep getting more powerful through new architectures, higher frequencies, and more cores, while GPUs are developed specifically to speed up graphics processing.

Processors are designed to perform a single task with the lowest latency and highest speed, while also switching between operations very quickly. In fact, CPUs process serially.

On the other hand, the graphics processor has been specifically developed to optimize the performance of graphics processing and provides the possibility of doing things simultaneously and in parallel. In the image below, you can see the number of cores of a processor and the number of cores of a graphics processor; This image shows that the main difference between CPU and GPU is the number of cores they have to process a task.

CPU vs GPU

Comparing the overall architectures of processors and graphics cards reveals many similarities: both use similar cache-layer structures, and both use a memory controller and main RAM. An overview of modern processor architecture shows that low-latency memory access is the dominant design concern, with a heavy focus on memory and cache layers (the exact layout depends on the vendor and processor model).

Each processor consists of several cache layers:

  • Level one cache memory (L1) is the fastest, smallest, and closest memory to the processor and stores the most important data needed for processing.
  • The next layer is the level two cache memory (L2) or the external cache memory, which is slower and larger than L1.
  • The L3 cache is shared by all cores and is larger and slower than the L1 and L2 caches; where an L4 cache is present, it is likewise larger and slower still. If data is not found in any cache layer, it is fetched from main RAM (DDR).
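The lookup order these layers imply can be sketched as a small simulation. The sizes are irrelevant here and the cycle counts are made-up illustrative values, not real hardware latencies:

```python
# The CPU checks L1 first, then L2, then L3, and only on a full miss
# goes out to main RAM. Latencies below are illustrative, not measured.
LEVELS = [
    ("L1", 4),     # fastest, smallest, closest to the core
    ("L2", 12),
    ("L3", 40),    # shared by all cores
    ("RAM", 200),  # main memory (DDR); assumed to always hold the data
]

def access(address, contents):
    """Return (where the data was found, total cycles spent searching)."""
    cycles = 0
    for level, latency in LEVELS:
        cycles += latency
        if address in contents.get(level, set()):
            return level, cycles
    return "RAM", cycles   # fell through every cache layer

hit = access(0x10, {"L1": {0x10}})                 # found immediately in L1
miss = access(0x99, {"L1": set(), "L2": set()})    # falls all the way to RAM
print(hit, miss)
```

The point of the sketch is the cost asymmetry: an L1 hit is a few cycles, while a full miss pays every layer’s latency on the way down, which is why CPU design invests so heavily in keeping hot data near the core.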

Looking at a general overview of GPU architecture (the exact layout depends on the manufacturer and model), we can see that this unit is focused on keeping its many cores busy rather than on fast cache access or low latency. In fact, the GPU consists of several groups of cores, each group paired with its own level-one cache.

Compared to the processor, the graphics processor has fewer cache layers with less capacity; it dedicates more of its transistors to computation and cares less about fast data retrieval from memory. The graphics processor is developed with parallel computation as its guiding approach.

High-performance computing is one of the effective and reliable uses of parallel processing to run advanced applications; Precisely for this reason, GPUs are suitable for this kind of calculations.

In simple terms, let’s say you have two options for doing some kind of heavy computation:

  • Using a small number of powerful cores that perform processes serially.
  • Using a high number of not-so-powerful cores that can perform several processes simultaneously.

In the first scenario, losing one of the cores is a serious problem: the performance of the remaining cores is affected and processing power drops sharply. In the second scenario, by contrast, losing a core causes no noticeable change in processing, and the rest of the cores continue working.

The GPU performs several tasks at the same time and the CPU performs one task at a very high speed

The bandwidth of the GPU is much higher than that of the processor, so it handles high-volume parallel processing much better. The most important point about graphics processors is that they are built for parallel processing: if an algorithm is serial and cannot be parallelized, it runs poorly, if at all, and slows the system down. CPU cores are more powerful than GPU cores, but the CPU’s bandwidth is much lower than the GPU’s.
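The serial-versus-parallel trade-off above can be sketched on one problem: summing a large list. Threads stand in for the "many weak cores" here; the split only works because the chunks are independent, which is exactly the parallelizability the paragraph describes.

```python
from concurrent.futures import ThreadPoolExecutor

data = list(range(1_000_000))

# Option 1: one "powerful core" works through the data serially.
serial_total = sum(data)

# Option 2: split the data into independent chunks and let several
# "weak cores" (threads here) sum them simultaneously, then combine.
chunks = [data[i:i + 250_000] for i in range(0, len(data), 250_000)]
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_total = sum(pool.map(sum, chunks))

# both strategies must agree on the answer
assert serial_total == parallel_total
```

A sum parallelizes because addition is associative; an algorithm whose each step depends on the previous step’s result offers no such chunks to hand out, which is why serial workloads gain nothing from a GPU.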

Familiarity with GPU architecture

At first glance, the CPU has larger but fewer computing units than the GPU. Of course, keep in mind that a core in the processor works faster and smarter than a core in the GPU.

Over time, the frequency of processor cores has gradually increased to improve performance, and on the contrary, the frequency of GPU cores has been reduced to optimize consumption and accommodate installation in phones or other devices.

The ability to execute instructions out of order can be seen as evidence of the processor cores’ intelligence. As mentioned, the central processing unit can execute instructions in a different order than the one given to it, or predict the instructions it will need in the near future and prepare their operands before execution, to optimize the system as much as possible and save time.

In contrast, a GPU core takes on no such complexity and does little outside its instructions and programs. Traditionally, the main specialty of GPU cores was floating-point operations such as multiplying two numbers and adding a third (A x B + C = result). Rounding the intermediate product before the addition is called multiply-add (MAD for short); carrying the product at full precision (without truncation) through the addition is called fused multiply-add (FMA).

The latest GPU microarchitectures today are no longer limited to FMA and perform more complex operations such as ray tracing or tensor kernel processing. Tensor cores and ray tracing cores are also designed to provide hyper-realistic renderings.

operands

Tensor kernels

Nvidia has produced graphics processors equipped with additional cores that, in addition to shaders, are used for artificial intelligence, deep learning, and neural network processing. These cores are called tensor cores. A tensor is a mathematical concept whose smallest imaginable unit has zero dimensions and contains only one value. Increasing the number of dimensions gives the other tensor structures:

  • One-dimensional tensor: a vector
  • Two-dimensional tensor: a matrix

Tensor cores fall into the SIMD category (“single instruction, multiple data”), and their use in GPUs turns the graphics chip from a specialized calculator into a much smarter device that can cover a wide range of computing and parallel-processing needs. In 2017, Nvidia introduced a completely new architecture called Volta, designed and built for professional markets; its GPUs were equipped with cores for tensor calculations, but GeForce graphics processors did not yet use them.

At that time, tensor cores could multiply 16-bit floating-point numbers (FP16) and accumulate the results in 32-bit (FP32). Less than a year later, Nvidia introduced the Turing architecture; its main difference from the previous one was bringing tensor cores to GeForce GPUs and supporting additional data formats such as eight-bit integers.

In 2020, the Ampere architecture was introduced in the A100 graphics processors for data centers; it increased core efficiency and power, quadrupled the number of operations per cycle, and added new data formats to the supported set. Today, tensor cores are specialized, limited pieces of hardware found in a subset of consumer graphics cards. Intel and AMD (the other two players in the world of computer graphics) do not have tensor cores in their GPUs, but they may offer similar technology in the future.

  • Tensor cores are widely used in physics, engineering, and mathematics: they can perform complex calculations in electromagnetism, astronomy, and fluid mechanics.
  • Tensor cores can increase the resolution of images: they take frames rendered at a lower graphics level (lower resolution) and raise their quality after rendering.
  • Tensor cores can increase the frame rate: they can recover frame rate in games once ray tracing is enabled.

Ray tracing engine

In addition to cores and cache layers, GPUs may include dedicated hardware to accelerate ray tracing, which simulates light from a source striking objects and creates regions that differ in illumination. Fast ray tracing lets video games display more realistic, higher-quality images.

Ray tracing is one of the biggest advancements in recent years in computer graphics and the gaming industry. At first, this feature was only used in the film industry, computer image production, and in animation and visual effects, but today PS5 and XBOX X series gaming consoles also support ray tracing.

In the real world, everything we see is the result of light hitting objects and reflecting into our eyes; ray tracing does the same thing in reverse, identifying the light sources, the path of the light rays, the material, the type of shadow, and the amount of reflection when light hits objects. The ray tracing algorithm renders light reflecting from objects of different materials in distinct, more realistic ways, draws the shadows of objects in the light’s path according to whether they are transparent or semi-transparent, and follows the laws of physics. For this reason, images produced with this feature are very close to reality.
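The geometric core of tracing a ray can be sketched with the classic ray-sphere intersection test (an illustrative fragment, not a full renderer): substituting the ray origin plus t times its direction into the sphere equation gives a quadratic in t, and a non-negative discriminant means the ray hits the surface.

```python
# Does a ray fired from `origin` along `direction` hit a sphere?
# Substituting origin + t*direction into |p - center|^2 = r^2 gives
# a quadratic a*t^2 + b*t + c = 0; a real root means an intersection.
def ray_hits_sphere(origin, direction, center, radius):
    ox, oy, oz = (origin[i] - center[i] for i in range(3))
    dx, dy, dz = direction
    a = dx * dx + dy * dy + dz * dz
    b = 2 * (ox * dx + oy * dy + oz * dz)
    c = ox * ox + oy * oy + oz * oz - radius * radius
    return b * b - 4 * a * c >= 0   # discriminant of the quadratic

# a ray straight down the z axis hits a sphere centered at z = 5 ...
print(ray_hits_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))   # True
# ... while a ray fired upward misses it
print(ray_hits_sphere((0, 0, 0), (0, 1, 0), (0, 0, 5), 1.0))   # False
```

A renderer runs tests like this for every pixel against every object, then again for shadow and reflection rays, which is why dedicated ray-tracing hardware and the frame-rate cost mentioned below both exist.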

Ray tracing on (left) versus ray tracing off (right)


Nvidia first shipped ray tracing in 2018 on RTX-series cards built on the Turing architecture, and later released a driver that added ray tracing support to some GTX-series cards, which perform it less well than their RTX counterparts.

AMD brought ray tracing to the PS5 and Xbox Series X|S consoles with its RDNA 2 architecture. Enabling this feature in games reduces the frame rate because of the heavy processing load; for example, a game that runs at 60 fps in normal mode may deliver only 30 fps with ray tracing.

Frame rate, measured in frames per second (fps), is a good measure of GPU performance, indicating the number of completed images that can be displayed per second. For comparison, the human eye is often said to perceive smooth motion at around 25 frames per second, but fast action games need at least 60 frames per second to render smoothly.
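Frame rate can also be read as a per-frame time budget, which makes the ray-tracing cost above concrete:

```python
# At a given frame rate, the GPU must finish each frame within a fixed
# time budget: 1000 ms divided by the frames per second.
def frame_budget_ms(fps):
    return 1000.0 / fps

print(round(frame_budget_ms(60), 1))   # 16.7 ms per frame at 60 fps
print(round(frame_budget_ms(30), 1))   # 33.3 ms per frame at 30 fps
```

A feature that doubles the work per frame doubles the frame time, which is exactly the 60 fps to 30 fps drop described above.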

What is GPGPU?

Many users repurposed the fast parallel processing of graphics processors, offloading any computation that could be parallelized onto this unit regardless of the GPU’s traditional role. The GPGPU, or general-purpose graphics processor, is the concept Nvidia introduced to formalize this use.

GPGPU (abbreviation of General Purpose Graphics Processing Unit) is the graphics processing unit that also performs non-specialized calculations (or CPU tasks) .

In fact, GPGPUs are used for tasks previously handled by powerful CPUs, such as physics calculations, encryption and decryption, scientific computing, and the mining of cryptocurrencies such as Bitcoin. Since GPUs are built for massive parallelism, they can take computational load off even the most powerful processors: the same cores used to shade many pixels simultaneously can similarly process many data streams simultaneously. Of course, these cores are not as complex as processor cores.

The GeForce 3 was Nvidia’s first GPU to feature programmable shaders. At the time, programmers aimed to make rasterized (bitmapped) 3D graphics more realistic, and this Nvidia GPU provided capabilities such as 3D transformation, bump mapping, and lighting calculations.

After the GeForce 3, ATI’s Radeon 9700 GPU was introduced with DirectX 9 support and programmability closer to that of CPUs. With the arrival of Windows Vista and DirectX 10, unified shader cores became standard, and this newly discovered capability of GPUs enabled more CPU-style computing.

Since the release of DirectX 10, more focus has been placed on GPGPU, and higher-level languages have been developed to ease programming computations on GPUs. AMD and Nvidia each pursued GPGPU development with their own programming interfaces: the open-standard OpenCL and Nvidia’s proprietary CUDA.

What is CUDA?

Simply put, CUDA allows programs to use the GPU as a co-processor. The CPU hands certain tasks to a CUDA-capable graphics card, which is optimized to process workloads such as lighting, motion, and interaction as quickly as possible, even along multiple paths simultaneously when necessary. The processed data is then sent back to the CPU, which uses it for larger and more important calculations.

Advantages of CUDA kernels

Computer systems are driven by software, so most processing must be expressed in program code. Since CUDA’s strength lies in calculation, data generation, and image manipulation, CUDA cores help programmers greatly reduce the time spent on effects, rendering, and output, especially for scaling operations and for simulations such as fluid dynamics and forecasting. CUDA also excels at light sources and ray tracing, and tasks like rendering effects, encoding, and video conversion are processed much faster with its help.

CUDA kernels

CUDA is designed to work with programming languages such as C, C++, and Fortran, making the GPU accessible to experts in parallel programming. By contrast, earlier APIs such as Direct3D and OpenGL required advanced skills in graphics programming.

This design is more efficient than CPUs for parallel processing of large blocks of data, as in the following examples:

  • Cryptographic hash functions
  • Machine learning
  • Molecular dynamics simulation
  • Physics engines
  • Sorting algorithms

Disadvantages of CUDA kernels

CUDA is Nvidia’s proprietary approach to general-purpose GPU computing (GPGPU), so you must use Nvidia products to take advantage of it. For example, if you have a Mac Pro, you cannot use CUDA cores, because that machine uses AMD graphics for graphics processing. Additionally, fewer applications support CUDA than its alternative.

OpenCL; CUDA replacement

OpenCL is a relatively new open standard considered an alternative to CUDA. Anyone can implement it in their hardware or software without paying for the technology or a proprietary license. CUDA uses the graphics card as a co-processor, while OpenCL passes the data over entirely and treats the graphics card more as a discrete processor. This difference in usage may be hard to measure precisely, but another, measurable difference is that coding for OpenCL is harder than for CUDA. On the other hand, with OpenCL you are not tied to any vendor, and support is so widespread that most apps don’t even advertise it.

OpenCL

CUDA and OpenCL vs. OpenGL

As mentioned before, OpenGL can be seen as where this story began; of course, this programming interface was not developed to use the graphics card as a general-purpose processor, and instead it simply draws pixels and vertices on the screen. OpenGL is a system that allows the graphics card to render 2D and 3D images much faster than a CPU. Just as CUDA and OpenCL are alternatives to each other, OpenGL is an alternative to systems like DirectX on Windows.

Simply put, OpenGL renders images very quickly, and OpenCL and CUDA handle the necessary calculations when videos interact with effects and other media; OpenGL may place content in the editing interface and render it, but when it comes to color correction for this content, CUDA or OpenCL will do the math to change the pixels. Both OpenCL and CUDA can use the OpenGL system, and a graphics-equipped system with the latest OpenGL support will always be faster than a computer with a CPU and integrated graphics.

OpenCL or CUDA

The main difference between CUDA and OpenCL is that CUDA is Nvidia’s proprietary framework, whereas OpenCL is an open standard. Assuming your software and hardware support both options, CUDA is recommended if you have Nvidia graphics; it works faster than OpenCL in most cases. Nvidia graphics cards also support OpenCL, although AMD cards get more out of OpenCL. The choice between CUDA and OpenCL depends on your needs, the type of work, the system, the workload, and its performance.

For example, Adobe explains on its website that, with very few exceptions, everything CUDA does for Premiere Pro can also be done by OpenCL. Still, most users who have compared the two standards find CUDA faster in Adobe products.

The most prominent brands

In the graphics processor market, AMD and Nvidia are well-known names. The former used to be ATI, founded in 1985, and sells its GPUs under the Radeon brand; Nvidia then became ATI’s competitor with the release of its first graphics processor in 1999. AMD bought ATI in 2006 and now competes with Nvidia and Intel on two different fronts. In practice, personal taste and brand loyalty are among the most important factors separating AMD and Nvidia buyers.

Nvidia released its GTX 10 series graphics cards, while AMD's equivalents are generally more affordable choices. Other competitors such as Intel are also in the game with on-chip graphics solutions, but currently AMD and Nvidia are the most prominent brands in this field. Nvidia cards, with more cores and higher frequencies, are well suited to gaming, but with smaller caches they have historically trailed AMD cards in some parallel workloads such as cryptocurrency mining. In the following, we briefly introduce the three leading brands in the world of graphics and graphics architecture; in the near future, we will examine these competitors, their architectures, and their products in detail in a separate article.

Intel

Intel is one of the largest manufacturers of computer equipment in the world, producing various microprocessors, semiconductors, integrated circuits, processors, and graphics processors; AMD and Nvidia are its two prominent competitors, each with its own fans. Intel's first attempt at a dedicated graphics card was the Intel740, released in 1998, which performed below market expectations and led Intel to stop developing discrete graphics products. The technology survived, however, in the Intel Extreme Graphics product line. After this failed attempt, Intel tried its luck in graphics once again in 2009 with the Larrabee architecture; this time, the technology it developed lived on in the Xeon Phi architecture.

In April 2018, news broke that Intel was assembling a team to develop discrete graphics processing units aimed at both the data center and gaming markets, bringing in Raja Koduri, former head of AMD's Radeon Technologies Group. Intel announced early on that it planned to introduce a discrete GPU in 2020. The first Xe discrete GPU, codenamed DG1, was shown in October 2019 as a development vehicle and was expected to serve as a GPGPU for data center and self-driving applications. The product was initially made on 10 nm lithography, moved to 7 nm in 2021, and used 3D-stacked packaging (Intel's Foveros technology).

Intel Xe, or Xe for short, is the name of Intel's graphics architecture, used as integrated graphics in Intel Core processors since the 11th generation. The company has also developed discrete graphics and desktop graphics cards based on the Xe architecture under the Arc Alchemist brand. Xe is a family of architectures, each with significant differences from the others, consisting of the Xe-LP, Xe-HP, Xe-HPC, and Xe-HPG microarchitectures.

Unlike previous Intel GPUs, which used execution units (EUs) as the computing unit, Xe-HPG and Xe-HPC use Xe cores. Xe cores contain vector and matrix computing logic units, called vector and matrix engines; in addition, they are equipped with L1 cache memory and other hardware.

  • Xe-LP (Low Power): Xe-LP is the low-power variant of the Xe architecture, used as integrated graphics in 11th-generation Intel Core processors, in the Iris Xe MAX mobile discrete GPU (codenamed DG1), and in the H3C XG310 server GPU (codenamed SG1). This series offers higher clock frequencies at the same voltage as the previous generation. In its largest configuration, Xe-LP has 96 execution units, 50% more than the 64 EUs of the 11th-generation graphics architecture in the Ice Lake series, so its computing resources have increased significantly. Alongside the extra execution units, Intel reworked the architecture: instead of the previous generation's two four-wide arithmetic logic units (ALUs), each execution unit now uses eight-wide ALUs. The Xe-LP architecture also adds an L1 cache, which reduces data-access latency, and supports end-to-end data compression, which increases effective bandwidth and speeds up tasks such as game streaming and video-chat recording.
  • Xe-HP (High Performance): Xe-HP is a high-performance, data-center-oriented variant of the Xe architecture, optimized for FP64 performance and multi-tile scalability.
  • Xe-HPC (High-Performance Computing): Xe-HPC is the high-performance computing variant of the Xe architecture. Each core in Xe-HPC includes 8 vector engines and 8 matrix engines, along with a large 512KB L1 cache.
  • Xe-HPG (High-Performance Graphics): Xe-HPG is the high-performance graphics variant of the Xe architecture; it builds on the Xe-LP microarchitecture with improvements from Xe-HP and Xe-HPC. Xe-HPG is focused on graphics performance and supports hardware-accelerated ray tracing, DisplayPort 2.0, neural-network-based supersampling (XeSS, similar to Nvidia's DLSS), and DirectX 12 Ultimate. Each Xe-HPG core contains 16 vector engines and 16 matrix engines.

Nvidia

Nvidia was founded in 1993 and is one of the main manufacturers of graphics cards and graphics processors (GPUs). It produces many types of graphics units, each offering unique capabilities. Below, we briefly introduce Nvidia's GPU microarchitectures and each one's improvements over the previous generation:

  • Kelvin: The Kelvin microarchitecture was released in 2001 and was used in the GPU of the original Xbox game console. GeForce 3 and GeForce 4 series graphics units were released with this microarchitecture.
  • Rankine: Nvidia introduced the Rankine microarchitecture in 2003 as an improved version of the Kelvin microarchitecture. This microarchitecture was used in the GeForce 5 graphics series. The video memory capacity in this microarchitecture was 256 MB and it supported vertex and fragment shading programs. Vertex shaders change the geometry of the scene and create a 3D layout. Fragment shaders also specify the color of each pixel in the rendering process.
  • Curie: Curie, the microarchitecture used in GeForce 6 and 7 series graphics, was released in 2004 as the successor to Rankine. Video memory capacity in Curie reached 512 MB, and it was the first generation of Nvidia graphics processors to support PureVideo video decoding.
  • Tesla: The Tesla graphics microarchitecture was introduced in 2006 and made several significant changes to Nvidia's GPU lineup. In addition to its use in the GeForce 8, 9, 100, 200, and 300 series, the Tesla architecture was used in Quadro products for workloads beyond graphics processing. In 2020, Nvidia retired the Tesla product name to avoid confusion with the Tesla electric-car company.
  • Fermi: Fermi was released in 2010 and offered features such as up to 512 CUDA cores, configurable L1 cache / shared memory partitioning with 64 KB per multiprocessor, and error-correcting code (ECC) support. The GeForce 400 and GeForce 500 series graphics cards were produced on this microarchitecture.
  • Kepler: The Kepler microarchitecture was introduced after Fermi in 2012 and came with key improvements over the previous generation. It was equipped with new SMX execution units and supported TXAA, a temporal anti-aliasing method. In temporal anti-aliasing, information from past frames is combined with the current frame to remove jagged edges: each pixel is sampled once per frame, but the sample lands at a different location within the pixel each time, and samples from past frames are blended with the current frame's samples to produce a better-quality image. Kepler consumed less power than Fermi and raised the CUDA core count to 1536. It supported automatic overclocking via GPU Boost and introduced GPUDirect, which lets graphics units communicate without going through the CPU. Nvidia used this microarchitecture in some GeForce 600, GeForce 700, and GeForce 800M series graphics units.
  • Maxwell: The Maxwell microarchitecture was released in 2014. Compared with Fermi, first-generation Maxwell GPUs were more efficient thanks to improved logic-partitioning control, reduced dynamic power dissipation by gating the clock when circuits are idle, better instruction scheduling and workload balancing, 64 KB of dedicated shared memory per execution unit, improved performance through native shared memory, and support for dynamic parallelism. Some GeForce 700, GeForce 800M, GeForce 900, and Quadro Mxxx series graphics units were released on Maxwell.
  • Pascal: Pascal replaced Maxwell in 2016. Graphics cards based on this microarchitecture (the GeForce 10 series) brought improvements over the previous generation such as NVLink support for speeds beyond the PCIe interface, High Bandwidth Memory 2 (HBM2) with up to 720 GB/s of bandwidth, preemption (temporarily interrupting a running task so that a higher-priority task can run), and dynamic load balancing to optimize the use of GPU resources.
  • Volta: Volta was a distinctive microarchitecture released in 2017. Most previous Nvidia GPU microarchitectures were developed for general use, but Volta GPUs targeted professional applications; in addition, Tensor Cores appeared for the first time in this microarchitecture. As mentioned earlier, tensor cores are a type of processing core that performs specialized matrix mathematics and is used mainly in artificial intelligence and deep learning. The Tesla V100, Tesla V100S, Titan V, and Quadro GV100 graphics units are based on Volta.
  • Turing: The Turing microarchitecture was introduced in 2018 and brought tensor cores to a range of consumer-focused GPUs alongside professional parts. Turing supports RTX real-time ray tracing and is also used for heavy workloads such as virtual reality (VR). Nvidia has used this microarchitecture in its GeForce 16, GeForce 20, Quadro RTX, and Tesla T4 graphics units.
  • Ampere: Ampere is Nvidia's newest microarchitecture, used mostly for high-performance computing (HPC) and artificial intelligence applications. It features third-generation tensor cores and supports the third-generation NVLink interface, structured sparsity (zeroing out unnecessary parameters to speed up the training of AI models), second-generation ray tracing, and Multi-Instance GPU (MIG) capability for partitioning a single GPU into separate, performance-isolated instances. Nvidia's GeForce 30 series graphics units, workstation cards, and data-center products are based on this microarchitecture.
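The temporal anti-aliasing idea described in the Kepler entry above, blending jittered samples from past frames with the current frame, can be sketched as a simple exponential accumulation. This is an illustrative toy model of the concept, not Nvidia's actual TXAA implementation.

```python
# Toy temporal accumulation: each new frame's sample is blended into a
# running history buffer, smoothing out per-frame jitter at edges.
# Illustrates the idea behind temporal anti-aliasing, not the real algorithm.

def accumulate(history, current, alpha=0.1):
    """Blend the current frame's sample into the history, per pixel value."""
    return [(1 - alpha) * h + alpha * c for h, c in zip(history, current)]

# Jittered samples of the same edge pixel over four frames:
frames = [[90.0], [110.0], [95.0], [105.0]]
history = frames[0]
for frame in frames[1:]:
    history = accumulate(history, frame)
print(history)  # the blended value settles between the jittered extremes
```

Because each frame contributes only a fraction (alpha) of the result, single-frame jaggedness is averaged away over time, which is the effect TXAA exploits.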

In general, Turing may be considered Nvidia's most popular microarchitecture, because its combination of ray tracing and rendering capabilities creates impressive 3D animations and strikingly realistic images. According to Nvidia, real-time ray tracing on Turing-based graphics units can calculate billions of rays per second.
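The basic building block behind those ray counts is a geometric intersection test repeated at enormous scale. The sketch below shows one textbook ray-sphere test in plain Python; it is a generic formulation for illustration only, unrelated to how Nvidia's RT cores are actually implemented.

```python
import math

# One ray-sphere intersection test: the operation ray-tracing hardware
# repeats billions of times per second. Solves the quadratic for the
# distance t at which (origin + t * direction) first hits the sphere.

def ray_hits_sphere(origin, direction, center, radius):
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4 * a * c
    if disc < 0:
        return None  # ray misses the sphere entirely
    return (-b - math.sqrt(disc)) / (2 * a)  # nearest hit distance

# A ray shot down the z-axis toward a unit sphere centered 5 units away:
print(ray_hits_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # 4.0
```

Each ray's test is independent of every other ray's, so the workload parallelizes naturally across GPU hardware.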

AMD

AMD (short for Advanced Micro Devices) was founded in 1969 and is now a prominent competitor to Nvidia and Intel in processors and graphics processors. After buying ATI in 2006, AMD continued developing that brand's products under its own name. AMD graphics units are produced in several series:

  • Radeon series: the widespread mainstream series inherited from ATI.
  • Mobility Radeon Series: Includes AMD’s low-power graphics, mostly used in laptops.
  • Fire Pro series: powerful AMD graphics designed for workstations.
  • Radeon Pro series: They are known as the new generation of Fire Pro graphics.

AMD graphics cards used to be numbered with four digits, for example Radeon HD 7750.

In these cards, within a generation, a bigger model number means a stronger, newer card; for example, the HD 8770 is stronger and more up to date than the HD 8750. This does not hold across generations, however: a 7000-series card cannot be judged weaker than an 8000-series card on the generation number alone. AMD changed its naming scheme starting with the Radeon RX 5700 series; the RX 5700 XT and RX 5700 were the first cards launched under the new scheme. AMD graphics units currently come in three general categories: the R5, R7, and R9 series.

  • R5 series: low-end and relatively weak AMD graphics.
  • R7 series: AMD's mid-range graphics, suitable for editing in programs such as Photoshop and After Effects.
  • R9 series: AMD's most powerful graphics belong to this family and provide acceptable gaming performance; some R9 cards are even designed for virtual reality (VR) devices.

Currently, the only active suffix for AMD graphics units is XT, which indicates a higher-end variant of that product with better performance and higher frequencies.

The difference between graphics processor and graphics card

Since the graphics processor is a specialized unit for processing and rendering computer graphics and has been optimized for that purpose, it performs the task far more efficiently than a central processor. The chip handles most in-game graphics calculations, image rendering, color management, and so on, and advanced techniques such as ray tracing and shading are defined for it. The graphics card, on the other hand, is a physical hardware component of a computer system that carries many electronic parts.

GPU on the graphics card

Graphics card production technology has changed a great deal over the years; two or three decades ago these parts were known as display cards or video cards. At the time, graphics cards lacked today's sophistication, and all they did was put images and video on the screen. As graphics capabilities grew and cards gained hardware accelerators for various graphics techniques, the name "graphics card" gradually came into use.

Graphics card components

Today these parts are far more powerful than before and, with various technologies, can supply what the system needs for different purposes such as gaming. Not all graphics cards have exactly the same components, but the key parts are the same. To use the card, you need to install the appropriate graphics driver on the system; the driver contains the instructions for recognizing and operating the graphics card and determines various settings for running games and programs. In addition to the graphics processor, a graphics card includes other parts such as video memory, a printed circuit board (PCB), connectors, and cooling. You can get to know these parts in the picture below.

Video memory

Video memory is where processed data is stored; it is separate from the system's main RAM and is typically GDDR (or, in some products, HBM) memory. It comes in different capacities depending on the intended use.

printed circuit board

A printed circuit board (PCB) is the board on which the graphics card's parts are mounted; it may consist of several layers, and the quality of the board material affects the quality of the card.

Display connectors

After processing, the data needs cables and display connectors to reach the screen. Different connector types are used depending on the product: digital connectors such as HDMI and DVI carry higher resolutions and frame rates, while the legacy VGA port is used for lower-resolution output. Today, most graphics cards have at least one HDMI port.

Bridge

Some high-end graphics cards can be used together with other high-end cards. This bridging feature drives a parallel-processing scheme for computer graphics used to increase processing power; Nvidia calls it SLI (Scalable Link Interface) and AMD calls it CrossFire.

  • SLI was first used by 3dfx in the Voodoo2 graphics card line. After buying 3dfx, Nvidia acquired the technology but initially did not use it. In 2004, Nvidia reintroduced the SLI name for modern systems based on the PCIe bus, although using it required a compatible motherboard.
  • CrossFire was a technology introduced by ATI that allowed multiple graphics cards to be used simultaneously on one motherboard. It relied on a controller chip on the board that managed the links between the cards and merged their output for display on the screen. Officially, up to four graphics cards can be installed in CrossFire, a configuration called Quad-CrossFire. The technology was first officially introduced in September 2005 to compete with SLI.

Graphic interface

The slot connecting the graphics card to the motherboard was called AGP (Accelerated Graphics Port) in the past; before AGP, cards used the PCI interface. What is used today is PCIe, or PCI Express, which connects the card, powers the board, and transfers its data. PCI stands for Peripheral Component Interconnect. The PCI-SIG consortium announced the goals of the sixth generation of PCIe in 2019 and finalized the PCIe 6.0 specification in 2022, at a time when the fourth generation had not yet become widespread and graphics cards in ordinary use do not even saturate PCIe 3.0.

PCIe 6.0 can transfer twice as much data as PCIe 5.0: 256 GB/s in total (128 GB/s per direction) across 16 lanes, achieved through a change in encoding rather than a higher signaling frequency. It is also backward compatible with previous generations, so older cards can be used in newer slots.

PCI Express bandwidth by data transfer rate per direction (GB/s per direction)

Slot   PCIe 1.0   PCIe 2.0   PCIe 3.0   PCIe 4.0   PCIe 5.0   PCIe 6.0
       (2003)     (2007)     (2010)     (2017)     (2019)     (2022)
x1     0.25       0.5        1          2          4          8
x2     0.5        1          2          4          8          16
x4     1          2          4          8          16         32
x8     2          4          8          16         32         64
x16    4          8          16         32         64         128
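The bandwidth figures follow a simple rule: roughly 0.25 GB/s per lane per direction in PCIe 1.0, doubling with each generation and scaling linearly with lane count. A quick sketch of that arithmetic:

```python
# Per-direction PCIe bandwidth follows a simple pattern: 0.25 GB/s per lane
# in generation 1, doubling each generation, times the number of lanes.

def pcie_bandwidth(generation, lanes):
    """Approximate per-direction bandwidth in GB/s."""
    return 0.25 * (2 ** (generation - 1)) * lanes

print(pcie_bandwidth(3, 16))  # 16.0  -> PCIe 3.0 x16
print(pcie_bandwidth(6, 16))  # 128.0 -> PCIe 6.0 x16
```

Checking the formula against the table, a PCIe 3.0 x16 slot gives 16 GB/s per direction and a PCIe 6.0 x16 slot gives 128 GB/s per direction, matching the listed values.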

Voltage regulator circuit

After the initial power delivery through the PCIe interface, the current supplied to the graphics card must be checked and adjusted. This is the job of the voltage regulator module (VRM), which provides the electric current required by parts such as the memory and graphics processor. Correct operation of this circuit, applying adequate and well-timed voltages, extends the life of the graphics card and optimizes energy consumption. In effect, the voltage regulator circuit decides how power is delivered. It consists of four sections: input capacitors, MOSFETs, chokes, and output capacitors.

  • Input capacitors: current is entered and stored in the circuit through input capacitors and sent to other parts of the circuit when necessary.
  • MOSFETs: MOSFETs act like switches for the current coming from the input capacitors. The voltage regulator circuit contains high-side and low-side MOSFETs: when the graphics processor needs current, it flows through the high-side MOSFET; when it does not, the low-side MOSFET completes the circuit.
  • Chokes: Chokes are electronic components that reduce current noise as much as possible. Graphics need smooth and stable flow to function properly, and chokes provide this by removing noise.
  • Output capacitors: after the chokes filter the current and before it is delivered to the target sections, the output capacitors provide a final stage of smoothing at the circuit's output.
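The smoothing role of the chokes and capacitors can be pictured with a toy low-pass filter over a noisy current trace. This is a numerical illustration of "removing ripple" only, not an electrical model of the circuit.

```python
# Toy illustration of current smoothing: a moving average acts as a crude
# low-pass filter, the same role chokes and capacitors play for the noisy
# current delivered to the GPU. Not a real electrical circuit model.

def smooth(samples, window=3):
    """Moving average over a sliding window of samples."""
    return [sum(samples[i:i + window]) / window
            for i in range(len(samples) - window + 1)]

noisy = [10.0, 12.0, 8.0, 11.0, 9.0, 12.0, 8.0]
print(smooth(noisy))  # values cluster near 10; the ripple is reduced
```

The filtered trace varies far less than the raw samples, which is exactly the "smooth and stable flow" the chokes are there to provide.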

Cooling system

Every graphics card must stay at an optimal temperature to perform at its best. The cooling system, besides lowering the product's operating temperature, prolongs the life of its parts. It consists of two components, a heatsink and a fan: the heatsink, usually made of copper or aluminum, is passive; its job is to draw heat away from the graphics processor and release it to the surrounding air. The fan is the active part, blowing air through the heatsink so it can keep shedding heat. Some low-end graphics cards have only a heatsink, but almost all mid-range and high-end cards combine a heatsink and a fan for effective cooling.

Types of graphics processor

Graphics processors come in several types, each handling graphics calculations and processing for the system in its own way. Below we look at these types, how they work, and the advantages and disadvantages of each:

iGPU

An iGPU (Integrated Graphics Processing Unit) is a graphics unit integrated into the same package as the central processor or, in older designs, installed on the motherboard. These units generally do not have much processing power, are not suited to advanced 3D game graphics and animation, and cannot be upgraded; they are designed for basic processing. Using them lets a system be thinner and lighter and reduces power consumption and cost. AMD markets its processors that combine a CPU with integrated graphics as APUs.

Of course, some modern processors can be surprisingly powerful with integrated graphics. Not every processor includes an iGPU: for example, Intel desktop processors whose model number ends with F, and the company's X-series processors, have no graphics unit. They are sold at a lower price, and systems built around them need a separate graphics card for graphics processing.

Currently, AMD and Intel are both working to improve their integrated graphics, and Apple has surprised many with its silicon chips, especially the M1 Max, whose very powerful integrated GPU can compete with high-end graphics.

iGPU vs. dGPU

dGPU

A dGPU (Discrete Graphics Processing Unit) is a separate graphics processing unit used as a dedicated chip in a system. A discrete GPU is usually much more powerful than an iGPU and makes it far easier to work with large, sophisticated 3D graphics data; for gaming systems or advanced 3D rendering and design, a powerful dGPU is essential. A discrete GPU can be replaced and upgraded easily, and besides its high performance it comes with dedicated cooling, so it does not overheat under heavy graphics loads. dGPUs are one reason gaming laptops are more expensive and heavier than ordinary laptops, with higher power consumption and shorter battery life; for this reason, a discrete GPU is recommended only if you use your system for gaming, 3D content creation, or other heavy workloads.

Currently, the biggest names in the discrete GPU industry are AMD and Nvidia, although Intel has also recently launched its own laptop GPUs in the form of the Arc series, and plans to launch desktop graphics cards as well.

However, discrete GPUs require a dedicated cooling system to prevent overheating and maximize performance, which is why gaming laptops are much heavier than traditional laptops.

Cloud GPU

A cloud graphics processing unit lets the user access graphics services over the Internet; that is, without owning a GPU, you can use a graphics processor's computing power. Cloud graphics are well suited to those who do not have a large budget or do not need highly advanced local graphics processing: such users pay different providers for the cloud graphics processing they actually consume.

eGPU

An external graphics card, or eGPU, is a graphics unit housed outside the system in an enclosure with its own PCIe slot and power supply; it connects to the system externally through USB-C or Thunderbolt ports. An eGPU lets the user pair powerful graphics with a compact, light system.

Note: Mac computers equipped with Apple's M1 chip do not support external graphics cards.

In recent years the use of external graphics has grown: since laptops generally offer lower graphics power and output quality than desktops, users have turned to eGPUs to close the gap. External graphics are mostly used with laptops, though some pair them with older, low-powered desktops. Note that if a laptop's graphics can be upgraded internally, an eGPU is hard to justify; and because of the space an eGPU occupies and its high cost, desktop users, especially gamers, generally prefer to upgrade the graphics card inside the system.

Mobile GPU

Mobile graphics shape our visual experience on phones and, for some users such as gamers, can even be decisive and inspire loyalty to a particular brand.

The mobile system on a chip (SoC) found in today's phones contains, besides the central processing unit, units for artificial-intelligence processing, an image signal processor for the camera, a modem, other important blocks, and a graphics processing unit. The mobile GPU was introduced to process heavy workloads such as 3D games and changed the world of phones, especially for gamers. As mentioned, the processing cores in a mobile GPU are individually weaker than CPU cores, but their fast, simultaneous operation makes it possible to display heavy content and complex graphics on phones.

Types of mobile GPUs

ARM is one of the major players in phone GPUs and owns the famous Mali brand; Qualcomm holds a large share of the phone graphics market with its Adreno GPUs; and Imagination Technologies has produced PowerVR GPUs for years, which Apple used for a long time before developing its own graphics processor. Interestingly, unlike Apple with its in-house GPUs, Samsung uses ARM or Qualcomm graphics processors for its phones' graphics calculations.

  • ARM; Mali GPU: Mali mobile GPUs are developed by ARM and sold across different price ranges. For example, the Mali-G78 MP14 used in the Galaxy S21 Ultra performs graphics processing with high speed and power.
  • Qualcomm; Adreno graphics processor: Along with the powerful Android processors it produces under the name of Snapdragon, Qualcomm also performs brilliantly in the production of mobile graphics processors. Like Mali GPUs, these units have a wide price range and target market. For example, the Adreno 660 GPU used in the Asus ROG Phone 5 gaming phone in 2021 was recognized as one of Qualcomm’s most powerful graphics.
  • Imagination Technologies; PowerVR GPU: PowerVR graphics processors were once used in the most popular iPhones, but Apple dropped them after producing its own graphics for its A-series Bionic chips. Today PowerVR GPUs appear mostly in affordable MediaTek chips and in budget and mid-range phones from brands like Motorola, Nokia, and Oppo.

Other applications of graphics processors

GPUs were originally developed as an evolution of graphics accelerators to lighten the processor's workload. Until about two decades ago they were known mainly as accelerators for rendering 3D graphics, especially in games. But because these units offer high parallel processing power and can process more data than a central processing unit (CPU), they gradually found uses beyond gaming, such as machine learning and digital currency mining. Below, we look at the uses of graphics processors other than gaming:

Video editing

Modern graphics cards include dedicated video-encoding hardware and can prepare and format video data before playback. Video encoding is a complex process that takes a long time on the central processor alone. With their fast parallel processing, GPUs can handle video encoding relatively quickly without overloading system resources. High-resolution encoding may still take time even on a powerful GPU, but a GPU that supports higher-resolution video formats will outperform the CPU by a wide margin for video editing.

3D graphics rendering

Although 3D graphics are most commonly used in video games and gaming, they are increasingly being used in other forms of media such as movies, television shows, advertisements, and digital art displays. Creating high-resolution 3D graphics, even with advanced hardware, just like video editing, can be an intensive and time-consuming process.

Modern film studios often depend on advanced GPU technology to produce realistic and dynamic computer graphics, making the hardware a vital part of the filmmaking process. Digital artists also use computers equipped with advanced graphics processors to create abstract works that cannot be produced in the usual physical space and produce works of art different from what we have seen so far. With the right combination of hardware performance and artistic vision, GPUs can be a powerful creative resource for computing and media content processing.

Machine learning

One of the lesser-known applications of modern GPUs is machine learning. Machine learning is a form of data analysis that automatically builds analytical models. Basically, machine learning uses data to learn, identify patterns, and make decisions independent of human input, and due to the very intensive nature of this system and the need for parallel processing, GPUs can be considered an essential part of this technology.

GPUs in machine learning

Machine learning underpins much of modern artificial intelligence, and it is a computationally demanding process that requires large amounts of input data for analysis. Software known as machine-learning algorithms builds models from so-called training data or sample data; the resulting models are then used to make predictions or decisions without human intervention. The approach is widely deployed in fields from medicine to email spam filtering, making machine learning a vital part of modern data infrastructure.
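A minimal sketch of the loop described above: fit a one-parameter model y = w·x to training data by least squares, then use the fitted model to predict without further human input. All numbers here are invented for illustration:

```python
# Toy training data, roughly following y = 2x.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.1, 3.9, 6.2, 7.8]

# Fit y = w*x by minimizing squared error.
# Closed-form solution: w = sum(x*y) / sum(x*x).
w = sum(x * y for x, y in zip(train_x, train_y)) / sum(x * x for x in train_x)

def predict(x: float) -> float:
    """Apply the learned model to unseen input."""
    return w * x

print(w)            # learned weight, close to 2
print(predict(5.0)) # prediction for a new data point
```

Real workloads train models with millions of parameters over huge datasets, and it is that scale of repetitive, parallelizable arithmetic that makes GPUs so well suited to the task.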

Blockchain and digital currency mining

One of the more common uses of GPUs besides gaming is mining digital currencies. In cryptocurrency mining, system resources are put to work for the blockchain, a continuously growing, cryptographically linked record of transaction data. Each entry in this record is called a block, and producing one requires a certain amount of computing power. Although blockchain technology has applications beyond digital currencies, it is best known for cryptocurrency mining (especially Bitcoin); the mining process itself differs depending on the currency in question.

GPU in digital currency mining

Specifically, Bitcoin mining involves dedicating hardware resources to creating blocks in the Bitcoin blockchain; as blocks are added, new bitcoins are issued as a reward. The process consumes system resources and power and reduces the system's responsiveness while mining is under way. The high throughput of GPUs, and their relatively low energy cost per computation compared with CPUs, made them a popular tool for mining.
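The computational core of mining can be sketched as a toy proof-of-work: search for a nonce whose hash meets a difficulty target. Real Bitcoin mining applies double SHA-256 to a binary block header against a far harder target; the function below is only an illustration of why the process consumes so much compute:

```python
import hashlib

def mine(block_data: str, difficulty: int) -> int:
    """Find a nonce whose SHA-256 hash of (block_data + nonce)
    starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1  # brute-force search: this is the expensive part

# Low difficulty so the toy search finishes in a fraction of a second.
nonce = mine("demo block", 3)
```

Each extra zero digit multiplies the expected number of hash attempts by 16, which is why miners want hardware that can run enormous numbers of independent hashes in parallel.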

Frequently asked questions

  • What is the use of a graphics processor (GPU)?

  • What is the difference between CPU and GPU?

  • What is the difference between GPU and VGA?
