Wash Trading in Crypto Exchanges
The world has been shocked by the fall of FTX, a major cryptocurrency exchange. Due to poor accounting, weak internal processes around accountability, highly leveraged lending, and a secret backdoor connection with Alameda Research, FTX went from a $32 billion company to essentially zero in the span of a week. Even for those who don’t believe crypto holds value, this matters: $2 billion worth of customer funds, much of it the savings of retail investors, has simply disappeared.
If you’re a crypto skeptic, you should care about making exchanges more transparent so that unsophisticated retail investors aren’t scammed out of their life savings. If you’re a crypto believer, you should also care so decentralisation is achieved meaningfully: through disrupting ‘rent-seeking industries’ instead of being scammy.
So, is there a way we can identify ‘fake exchanges’? Specifically, we’ll look for a subset of fake exchanges: ones that inflate their total transaction volume (TTV) with ‘wash trading’, where a single trader buys and sells the same security to generate misleading market information.
To do this, we’ll start by thinking about what real exchanges should look like, then apply those heuristics to a couple of major exchanges and predict which ones wash trade.
Sure, if you weren’t working with FTX or Alameda closely, there was no way to predict the events of the last month. But in the future, maybe we can stay away from exchanges that are doing the transaction volume equivalent of shouting “Bitconnect”.
To form these heuristics, let’s understand the two main types of fake exchanges. The first simply lies about displayed trades. Admittedly, this is rare, since transactions can be publicly tracked due to the nature of the blockchain (etherscan, blockcypher etc.), but it still happens. The second, which is more common, practices ‘wash trading’. Here the transaction volume is real, but the exchange itself is buying and selling cryptocurrencies and assets against itself to boost volume. Many of these exchanges are ‘pump and dump schemes’, where social media attention paired with seemingly boosted transaction volume creates a positive feedback loop of increased real volume, before the exchange shuts down and runs away with customer funds.
Given that both strategies ultimately inflate TTV artificially, we can form three heuristics: the mix of buys and sells, trade volume, and candle size. We’ll elaborate on each individually; together, they should give us a good picture of the degree to which an exchange is wash trading. We’ll only consider the trading volume of $BTC, since it is universal and enjoys the highest TTV globally, making comparison across exchanges easier.
We will use these heuristics to assess the following exchanges. Coinbase is certainly not participating in wash trading (please refer to the appendix); alongside it, we will investigate several exchanges from the top 1000. Further reasoning for the choice of exchanges is in the appendix. I have chosen not to use Binance since they have not been audited in the past.
Exchange | CoinMarketCap Ranking (29 Nov 2022) | Daily TTV (USD) (29 Nov 2022) |
---|---|---|
Coinbase | 2 | 1,297,709,600 |
Bithumb | 13 | 197,762,982 |
BkEx | 18 | 544,836,394 |
LBank | 20 | 1,483,981,969 |
Bitforex | 60 | 692,626,406 |
IndoEx | 83 | 2,151,321,864 |
The ‘runs’ of buy and sell orders in an exchange that does not wash trade should be unequal and streaky. Consider the following sequence of buys and sells where “B” is Buy and “S” is Sell:
"BBBBBBSSSBBBSBBBSSBSBSSSS"
The sequence of runs here would be “BBBBBB”, “SSS”, “BBB” and so on. Intuitively, each buy is not always instantly met with a sell, which makes the runs both unequal and streaky. A near-constant alternating pattern like the following should draw suspicion:
"BSBSBSBSBSBSBBSBSSBSBSBSBS"
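To make the idea of runs concrete, here is a minimal Python sketch that splits a buy/sell string into its runs and summarises their lengths. The function name `run_lengths` and the summary statistics are illustrative choices, not part of the original analysis.

```python
from itertools import groupby


def run_lengths(sequence: str) -> list[tuple[str, int]]:
    """Split a string of 'B'/'S' characters into (side, length) runs."""
    return [(side, len(list(group))) for side, group in groupby(sequence)]


# The two example sequences from the text.
streaky = "BBBBBBSSSBBBSBBBSSBSBSSSS"
alternating = "BSBSBSBSBSBSBBSBSSBSBSBSBS"

for label, seq in [("streaky", streaky), ("alternating", alternating)]:
    lengths = [n for _, n in run_lengths(seq)]
    mean_length = sum(lengths) / len(lengths)
    # Longer and more varied runs suggest organic order flow;
    # runs that almost never exceed length 1 look suspicious.
    print(label, lengths, f"mean={mean_length:.2f}", f"max={max(lengths)}")
```

The first sequence contains runs as long as six, while the second never exceeds two, which is the kind of difference this heuristic is meant to surface.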
We can think of each transaction’s type as a Bernoulli random variable, so that the length $X$ of a run follows a geometric distribution, $X \sim \mathrm{Geom}(p)$. We assume that $p$ approaches 0.5 the more wash trading occurs, since every buy has to be met with a sell; this does not necessarily hold in general. Additionally, buys and sells are not truly i.i.d., since a buy decreases the likelihood of a following buy, but we will proceed with that assumption for now (this is an important caveat discussed in the appendix). We will compute the best $p$ (the prior probability that a transaction is a buy) by taking the mode of a Beta distribution, which here is a distribution over the success probability of a Bernoulli random variable. Since the PDF of a Beta is given by:
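$f(x) = \frac{(x-a)^{p-1}(b-x)^{q-1}}{B(p, q)(b-a)^{p + q - 1}}$
and the mode, for the standard case $a = 0$, $b = 1$ with shape parameters $p, q > 1$ (distinct from the Bernoulli $p$ above), is given by:
$\frac{p-1}{p + q - 2}$
we can compute this from the parsed string.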
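One way this could be computed from the parsed string is sketched below. It rests on an assumption the text does not state: we start from a uniform Beta(1, 1) prior and apply the standard conjugate update, so the shape parameters become 1 plus the number of buys and 1 plus the number of sells. The names `beta_mode`, `estimate_p_buy`, `p_shape` and `q_shape` are hypothetical, chosen to mirror the $p$ and $q$ in the formulas above.

```python
def beta_mode(p_shape: float, q_shape: float) -> float:
    """Mode of a Beta(p, q) distribution on [0, 1]; defined for p, q > 1."""
    if p_shape <= 1 or q_shape <= 1:
        raise ValueError("the mode formula requires both shape parameters > 1")
    return (p_shape - 1) / (p_shape + q_shape - 2)


def estimate_p_buy(sequence: str,
                   prior_p: float = 1.0,
                   prior_q: float = 1.0) -> float:
    """Estimate P(buy) as the mode of the Beta posterior.

    Assumes i.i.d. trades and a Beta(prior_p, prior_q) prior; the default
    uniform prior is an assumption of this sketch, not of the article.
    """
    n_buy = sequence.count("B")
    n_sell = sequence.count("S")
    # Conjugate update: each buy adds 1 to p, each sell adds 1 to q.
    return beta_mode(prior_p + n_buy, prior_q + n_sell)


streaky = "BBBBBBSSSBBBSBBBSSBSBSSSS"
alternating = "BSBSBSBSBSBSBBSBSSBSBSBSBS"

print(f"streaky:     p = {estimate_p_buy(streaky):.2f}")
print(f"alternating: p = {estimate_p_buy(alternating):.2f}")
```

With a uniform prior the posterior mode reduces to the observed fraction of buys, so the estimate for the alternating sequence sits at exactly 0.5, consistent with the assumption that $p$ approaches 0.5 under wash trading.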