High-frequency trading (HFT) has received a lot of attention over the past few years and has become an increasingly important component of financial markets. HFT is all about speed: the faster your algorithms can analyze exchange data and execute trade orders, the higher your potential profit.
So the ‘arms race’ in this area never stops: market players continuously invest in more powerful solutions that can trade securities, derivatives, and other financial instruments in a matter of nanoseconds. Only those HFT firms that keep pace with technological innovation will be able to secure their competitive advantage in the future.
To reduce the time needed for the market data round-trip, investment banks, hedge funds, and institutional investors spend big sums of money on faster software, networks with lower latency, and computing facilities closer to stock exchanges.
When it comes to hardware acceleration, the solution is often to offload compute-intensive portions of trading functions to GPUs, FPGAs, or custom processors. CPUs are still valuable for certain tasks, but on their own they can no longer maintain the required speed of trade execution.
What is FPGA Technology?
To answer the question “What is FPGA technology”, we need to take a closer look at its components and structure.
Strange as it may sound, an FPGA (field-programmable gate array) is nothing more than a chip containing up to millions of logic blocks repeated throughout the silicon. Think of it as a counterpart to the microprocessor in your laptop or smartphone, except that here the circuitry itself is what gets programmed. Each logic block is built around lookup tables (LUTs), which can be configured to implement basic logical operations such as Boolean AND, OR, NAND, or XOR.
To form an algorithm, LUTs are connected to each other in a specific order by means of configurable switches. Both the LUTs and the surrounding interconnect fabric are programmable, which yields a flexible system that can be adjusted to implement almost any algorithm.
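To make the idea of a lookup table more concrete, here is a minimal software sketch. It models a LUT as a precomputed truth table and "wires" two of them together into a half adder. The names `make_lut`, `xor_lut`, `and_lut`, and `half_adder` are illustrative assumptions, and Python only imitates the behaviour; a real FPGA is configured through a hardware description language and vendor tools.

```python
from itertools import product


def make_lut(func, n_inputs):
    """Precompute the output bit of `func` for every combination of input bits."""
    return {bits: func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)}


# "Configuring" two hypothetical 2-input LUTs: the stored table decides
# which Boolean function each one implements.
xor_lut = make_lut(lambda a, b: a ^ b, 2)
and_lut = make_lut(lambda a, b: a & b, 2)


def half_adder(a, b):
    """Feed the same inputs to both LUTs: XOR gives the sum bit, AND the carry."""
    return xor_lut[(a, b)], and_lut[(a, b)]


for a, b in product((0, 1), repeat=2):
    print(a, b, half_adder(a, b))
```

Swapping the function passed to `make_lut` changes what the table computes without changing its structure, which is roughly what reprogramming a LUT amounts to. For example, `half_adder(1, 1)` returns `(0, 1)`: sum bit 0, carry bit 1.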

The first commercially viable FPGA device was introduced in 1985 by the co-founders of Xilinx. At that point, the chip’s capacity was relatively small, so only simple logic could be implemented on a single device.
Today’s FPGAs offer gate counts in the millions, which allows them to accommodate very complex, large-scale designs. It is not surprising that this component is often seen as a hardware analog of a program.
FPGA Ultra-Low Latency Drivers
The programmability and extensive capacity of FPGA chips are certainly very important characteristics. But it is the hardware’s parallel architecture and deterministic nature that make it such an effective solution for reducing round-trip latencies and thus increasing trade volumes.

Parallel Architecture
FPGA devices do not have a fixed processor architecture, so they avoid the operating system overhead, interfaces, and interrupts typical of CPUs.
Processing paths in this hardware are parallel, which means different functions do not have to compete for the same operating resources. As a result, a single FPGA chip can have 10 or more control loops running on it simultaneously at different rates.
The parallel architecture of FPGA is the key behind its ability to rapidly execute buy and sell orders.
Implementing mathematical computations at this low level, however, is practical only for relatively simple algorithms that can be broken down into a set of tasks. Separate functional blocks can then be processed in different clock cycles.
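To illustrate how an algorithm that is broken into separate functional blocks can be spread over different cycles, here is a minimal software sketch of a three-stage pipeline. The stage functions, the `REFERENCE` and `THRESHOLD` values, and the `run_pipeline` helper are all hypothetical, and the loop only models the behaviour of hardware stages that work at the same time.

```python
REFERENCE = 100   # hypothetical reference price, in ticks
THRESHOLD = 3     # hypothetical signal threshold, in ticks

# Three functional blocks: parse the text, compute a delta, compare it.
stages = [
    lambda text: int(text),
    lambda price: price - REFERENCE,
    lambda delta: delta > THRESHOLD,
]


def run_pipeline(stages, inputs):
    """Simulate the pipeline clock by clock; return (cycle, result) pairs."""
    pending = list(inputs)
    regs = [None] * len(stages)   # regs[i] holds the value latched after stage i
    completed = []
    cycle = 0
    while pending or any(r is not None for r in regs[:-1]):
        cycle += 1
        # Update from the last stage to the first so that every stage reads
        # the value its predecessor produced in the previous cycle.
        for i in reversed(range(len(stages))):
            if i > 0:
                src = regs[i - 1]
            else:
                src = pending.pop(0) if pending else None
            regs[i] = stages[i](src) if src is not None else None
        if regs[-1] is not None:
            completed.append((cycle, regs[-1]))
    return completed


print(run_pipeline(stages, ["100", "105", "98"]))
# prints [(3, False), (4, True), (5, False)]
```

The first result needs three cycles to pass through all stages, but after that one result leaves the pipeline in every cycle, because all three blocks are busy with different inputs at once.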
The parallelism of FPGA devices also makes them very resilient. Not being affected by the software updates and changes typical for IT systems, the hardware is able to provide and maintain a high level of service at all times. Stable and self-contained in nature, an FPGA chip contributes to the smooth functioning of the whole HFT infrastructure.
Compared with FPGAs, general-purpose processors are better at dealing with complex problems that require less parallelism. In high-frequency trading, such problems include, for instance, calculating the total cost of the end buys, sells, and cancels necessary to keep portfolios risk-adjusted.
Another example is feeding price and news sources into trading indicators, which traders and managers then use to decide on the correct adjustments to trading systems.
Determinism
The hardware implementation of an algorithm results in a high level of determinism. This means that even during market bursts, when networks are overloaded with information, an FPGA quickly transmits data from the trading venue and back. Irrespective of network conditions, the chip always passes through the same sequence of states, providing the same output for every given input.
Since random events within the processing paths are very limited, FPGA components deliver repeatable and predictable processing latency. Moreover, a finite number of operating states lowers the risk of functional errors and makes complete test coverage feasible. This gives users of FPGA-accelerated systems high confidence in the output integrity.
CPUs, on the other hand, are well known for their less predictable processing times. This is due to the operating system and event-driven interrupts, which give a near-infinite number of path variations through the program flow. CPUs do, however, become indispensable when it comes to switching between different tasks and solving problems that constantly change in both size and scope.
Accelerating the HFT Engine
Due to their parallelism and determinism, FPGA solutions can significantly accelerate the computation of mathematical models and transmission of data to the exchanges’ matching engines.
Taken across the full range of their capabilities, these chips are probably inferior to standard processors. But when it comes to the concurrent implementation of simple, repetitive, and wide tasks, FPGAs can far outpace CPUs.
High Frequency Trading System Architecture
At the bare minimum, a high frequency trading system architecture always includes:
- Input – live market data
- Output – trading orders
- Trading strategy – trading algorithms
The input component is responsible for the non-stop processing of live market data and most often includes a market data parser that brings all inbound exchange protocols to a single format.
The output block is represented by an order gateway that converts internal order formats to exchange protocols. And of course, it’s up to trading algorithms to decide whether to complete a trade or not.
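The three components described above can be sketched in a few lines of software. This is only an illustration under stated assumptions: the venue names `VENUE_A` and `VENUE_B`, both wire formats, the `Tick` and `Order` types, and the toy trading rule are all made up for the example and do not correspond to any real exchange protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tick:
    symbol: str
    bid: float
    ask: float


@dataclass(frozen=True)
class Order:
    symbol: str
    side: str
    price: float
    quantity: int


def parse_market_data(venue: str, raw: str) -> Tick:
    """Input: normalize two made-up venue formats into a single Tick."""
    if venue == "VENUE_A":            # hypothetical CSV: "SYM,bid,ask"
        symbol, bid, ask = raw.split(",")
    elif venue == "VENUE_B":          # hypothetical pairs: "s=SYM|b=bid|a=ask"
        fields = dict(item.split("=") for item in raw.split("|"))
        symbol, bid, ask = fields["s"], fields["b"], fields["a"]
    else:
        raise ValueError(f"unknown venue: {venue}")
    return Tick(symbol, float(bid), float(ask))


def strategy(tick: Tick, fair_value: float) -> Optional[Order]:
    """Trading strategy: toy rule that buys when the ask is below fair value."""
    if tick.ask < fair_value:
        return Order(tick.symbol, "BUY", tick.ask, 100)
    return None


def encode_order(venue: str, order: Order) -> str:
    """Output: order gateway that converts the internal order to a venue format."""
    if venue == "VENUE_A":
        return f"{order.side},{order.symbol},{order.price:.2f},{order.quantity}"
    if venue == "VENUE_B":
        return (f"side={order.side}|s={order.symbol}"
                f"|p={order.price:.2f}|q={order.quantity}")
    raise ValueError(f"unknown venue: {venue}")


tick = parse_market_data("VENUE_B", "s=ABC|b=9.98|a=9.99")
order = strategy(tick, fair_value=10.0)
if order is not None:
    print(encode_order("VENUE_A", order))   # prints BUY,ABC,9.99,100
```

Because the strategy only ever sees the normalized `Tick` and only ever produces the internal `Order`, new venues can be added by extending the parser and the gateway without touching the trading logic.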
Naturally, a high frequency trading system architecture also typically involves a monitoring GUI that offers candlestick charts and other diagrams to assess the performance of an HFT system.
On the hardware side of things, high frequency trading network architecture implies the use of ultra-fast network communications, high-performance switches and routers, specialized servers, and operating system optimizations such as kernel bypass.
What HFT Tasks Can Be Processed in FPGA
The use of FPGA platforms in high-frequency trading enables companies to collect, cleanse, enrich, and disseminate the burgeoning array of rapidly changing financial data within a very short time. Without loading the CPU, FPGA hardware is able to quickly execute various trading tasks, which include, among others:
- Parsing the incoming data, providing data filtering, decoding, and normalization
- Carrying out pre-trade volume, price, and collateral checks
- Monitoring the value and loss scenarios of financial portfolios on an ongoing basis
- Calculating yields from fixed income investments, prices of securities and their derivatives
- Generating outgoing orders, transmitting them to the matching engines
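As an illustration of one item from the list above, here is a minimal software sketch of pre-trade volume, price, and collateral checks. The `Limits` fields, the threshold values, and the `pre_trade_check` function are assumptions made for this example; real risk rules depend on the firm and the venue, and on an FPGA the three comparisons would typically be evaluated side by side rather than one after another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_quantity: int            # largest order size allowed
    max_price_deviation: float   # allowed relative distance from the reference price
    available_collateral: float  # funds available to cover the order


def pre_trade_check(price, quantity, reference_price, limits):
    """Return the names of the failed checks; an empty list means the order passes."""
    failures = []
    if quantity <= 0 or quantity > limits.max_quantity:
        failures.append("volume")
    if abs(price - reference_price) > limits.max_price_deviation * reference_price:
        failures.append("price")
    if price * quantity > limits.available_collateral:
        failures.append("collateral")
    return failures


limits = Limits(max_quantity=1000, max_price_deviation=0.05,
                available_collateral=50_000.0)

print(pre_trade_check(100.0, 100, 100.0, limits))    # prints []
print(pre_trade_check(120.0, 2000, 100.0, limits))   # prints ['volume', 'price', 'collateral']
```

Each check depends only on the order and the limits, not on the outcome of the other checks, which is what makes this kind of task a natural fit for parallel hardware.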
Custom FPGA Design Solutions
Our customers often ask about implementing their strategies in FPGA, so we thought that sharing Velvetech’s expertise in custom FPGA design solutions would make for an interesting conversation. Below are a few quick examples of how custom FPGA programming can work for traders:



