Hi.
The only benchmark that makes any sense to me is your normal workload.
Suppose IT gave you a VM with twice as many CPUs and/or a wonderful array of SSDs compared with your physical machine, and your workload finished in far less time on the VM. That would be a definite plus.
At a university computer center, whenever we made changes, we would run a sample of user applications. If we saw a large increase or decrease in real time, we would look closely to determine the cause -- it was almost always a failure of the new code; many cases would simply crash, which decreased the real time.
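The approach above can be sketched in a few lines of Python. This is a minimal, hypothetical harness -- the command names and the 20% threshold are my own assumptions, not anything from a real setup -- that times a sample workload by wall-clock (real) time and flags runs that changed noticeably between two configurations (say, physical vs. VM):

```python
import subprocess
import time

def time_command(cmd):
    """Run a shell command and return its wall-clock (real) time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, shell=True, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start

def flag_changes(baseline, current, threshold=0.20):
    """Compare two {name: seconds} dicts of timings and return the runs
    whose real time changed by more than `threshold` (fractional change).
    Those are the ones worth a closer look -- a big *decrease* can mean
    the new code crashed early rather than ran faster."""
    flagged = {}
    for name, old_t in baseline.items():
        change = (current[name] - old_t) / old_t
        if abs(change) > threshold:
            flagged[name] = change
    return flagged
```

You would run the same sample of applications on both machines, collect the timings into the two dicts, and investigate anything flagged -- remembering that a dramatic speedup deserves as much suspicion as a slowdown.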
Running benchmarks over a few megabytes would not be meaningful for me. I have run bonnie++ on different configurations (various RAID combinations, for example), so that might be useful if you let it run long enough -- 20-60 minutes on your physical machine -- and then compare it to runs on the VM.
Synthetic benchmarks are useful if they resemble your day-to-day work; otherwise, they are perhaps better suited for water-cooler discussions.
While working at an NSF-funded lab, one book I referred to was
The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, by Raj Jain (ISBN 9780471503361)
See especially chapter 4, on techniques and tools, but note that this is quite an old book.
Best wishes ... cheers, drl