One of the oddest pricing metrics I have encountered is in the used book business. People in the used book trade often buy books by weight. The going rate in the US today seems to be from 5¢ to 25¢ per pound, with the unit of purchase being the pallet (a pallet of books weighs about 1,300 pounds, so it will cost between $65 and $325).

Buying books by the pound! I can’t think of anything further from value-based pricing. Why does this make sense?

It is almost like mining. The books bought by the pound are the paydirt: “dirt, rock, or other raw materials which contains (or may contain) profitable quantities of ore or other valuable materials.” To make paydirt valuable, one needs to do a lot of processing and then transport it to market. The same with the books, though in this case the processing is done in the seller’s warehouse where it is more convenient. Buying books by the pound is a way to manage uncertainty about value. It reduces all of the books to their lowest common denominator and uses that as the pricing metric. If uncertainty about value could be reduced, the price would go up.

In fact, that is exactly what happens. Buyers use a variety of techniques to reduce uncertainty. Sometimes they buy a sample. The minimum buy is usually a pallet, but sometimes it is a truckload. One then works through the sample, sees what is in it, and determines how much to pay for the rest. Of course one has to hope it is really a representative sample.

The other technique is provenance. Where did the books come from? Have they already been picked over, with all the best books taken, or are they fresh to the market? Are they the discards from a small local library or are they from a private estate? Private estates are generally more valuable than other sources; these are the shipments selling at 25¢ or more per pound.

An important thing to remember about the used book trade is that there is a floor price. If you can’t sell the book you can sell it for pulp.
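
To make the sampling approach and the pulp floor concrete, here is a rough sketch in Python of how a buyer might turn a sample pallet into a maximum price per pound. The function name, the resale figures, the processing cost and the pulp price are all hypothetical, invented for illustration. Real dealers will have their own methods, and this ignores details such as books that are listed but never sell.

```python
def max_price_per_pound(sample_resale_values, sample_weight_lb,
                        processing_cost_per_lb, pulp_price_per_lb):
    """Estimate the most a buyer could pay per pound and still break even.

    sample_resale_values: expected resale value in dollars of each book in
    the sample that is worth listing (unsellable books are left out).
    All inputs are hypothetical figures supplied by the buyer.
    """
    sellable_value_per_lb = sum(sample_resale_values) / sample_weight_lb
    net_value_per_lb = sellable_value_per_lb - processing_cost_per_lb
    # Books that cannot be sold can still go for pulp, so the lot is
    # never worth less than the pulp price per pound.
    return max(net_value_per_lb, pulp_price_per_lb)


# Hypothetical sample: one 1,300 lb pallet containing 130 sellable books
# at an expected $4 each ($520 in total), with 20 cents per pound of
# sorting and listing cost and a 2 cents per pound pulp price.
example_price = max_price_per_pound([4.00] * 130, 1300, 0.20, 0.02)
```

With these made-up numbers the sample supports a price of about 20¢ per pound, which sits inside the 5¢ to 25¢ range quoted above. A sample with almost nothing sellable in it falls back to the pulp price.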

Is there anything like this in software? In fact yes. There is a small but emerging market for undifferentiated data sets. There are some companies that are buying data by volume (megabytes, or even terabytes). The rapid improvements in data mining are making it worth buying undifferentiated data (the paydirt) and processing it into ore (nuggets of insights or even predictions).

Unfortunately, some of this is happening in the grey market or even the black market. Over the past six months, the inevitable offers from questionable development shops have been joined by people offering data of dubious provenance, or even dubious legality. The prices are crazy low, but there must be someone who is buying or there would not be so many sellers.

We are not suggesting you buy any of this data. Don’t. Only buy data if you know it has been legally sourced and you can test the value. But if you are selling data, and many organizations are, you want to avoid selling it by volume. How do you do that?

Don’t sell paydirt, sell ore.

Know your data, and be able to describe its properties to possible buyers. You have to be able to do some data mining yourself so that you are providing the assays to the buyer and not relying on them. Reducing search costs increases value. This is especially true in a world awash in data of all levels of quality and relevance.
Differentiate your data. Anything undifferentiated gets sold at the market price. The market price trends toward the variable cost, and the variable cost of data approaches zero. Remember that differentiation is always (i) for a specific customer or customer segment and (ii) relative to the next best competitive alternative. A market segment is defined as a set of buyers who get value in the same way.
Add value to the data through rich metadata, at the file level and through embedded metadata. Just what metadata you use to enhance value will depend on your target markets and their use cases.
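
As one possible illustration of providing the assay yourself, the sketch below profiles a small data set so that a seller can describe it to a buyer. It is only a minimal example under stated assumptions: the `assay` function, the field names and the sample records are hypothetical, the records are assumed to be plain dictionaries with simple hashable values, and a real assay would report whatever properties matter to the target segment.

```python
def assay(records):
    """Summarize a list of dict records for a prospective buyer.

    Returns the row count and, for each field, the share of rows with a
    non-empty value (completeness) and the number of distinct values.
    Assumes values are simple hashable types such as strings and numbers.
    """
    fields = sorted({key for row in records for key in row})
    report = {"rows": len(records), "fields": {}}
    for field in fields:
        values = [row.get(field) for row in records]
        # Treat missing keys, None and empty strings as absent.
        present = [v for v in values if v is not None and v != ""]
        report["fields"][field] = {
            "completeness": len(present) / len(records) if records else 0.0,
            "distinct": len(set(present)),
        }
    return report


# Hypothetical records, invented for illustration.
sample_records = [
    {"isbn": "a", "price": 4.5},
    {"isbn": "b", "price": None},
    {"isbn": "a"},
]
sample_report = assay(sample_records)
```

A summary like this can travel with the data as file-level metadata, so the buyer sees the row count, completeness and variety before paying rather than having to dig for it.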
