TL;DR - How to calculate a directionally accurate data size (and hence costs) for vector databases. File systems report PDF file sizes that are 5x to 20x the size of the actual text content in them because of the ...