[ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

Mathieu Desnoyers mathieu.desnoyers at polymtl.ca
Mon Feb 2 19:46:32 EST 2009


* Jens Axboe (jens.axboe at oracle.com) wrote:
> It's also working around the real problem for this specific issue, which
> is that you just don't want to have sync apps blocked waiting for async
> writeout in the first place.
> 

Maybe I could help to identify criteria for such sync requests which
are treated as async. From a newcomer's look at the situation, I would
assume that:

- Small I/O requests
- I/O requests caused by major page faults, except those caused by
  access to mmapped files which result in large consecutive file
  reads/writes.

Should never *ever* fall into the async I/O request path. Am I correct?
If yes, then I could trigger some tracing test cases and identify the
faulty scenarios with LTTng. Maybe the solution does not sit only within
the block I/O layer:

I guess we would also have to find out what is considered a "large" and
a "small" I/O request. I think using open() flags to specify if
I/O is expected to be synchronous or asynchronous for a particular file
would be a good start (AFAIK, only O_DIRECT seems to be close to this,
but it also has the side-effect of not using any kernel buffering, which
I am not sure is wanted in every case). If this implies adding new
flags to open(), then supporting older apps could be done by heuristics
on the size of the requests. New applications which have very specific
needs (e.g. large synchronous I/O) could be tuned with the new flags.
Any small request coming from the page fault handler would be treated as
synchronous. Requests coming from the page fault handler on a
particular mmapped file would behave following the sync/async flags of
the associated open(). If no flag is specified, the heuristic would
apply to the resulting merged requests from the page fault handler.
Therefore, large consecutive reads of mmapped files would fall in the
"async" category by default. mmap of shared libraries and memory mapping
done by exec() should clearly specify the "sync" flag, because those
accesses *will* cause delays when the application needs to be executed.
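
To make the proposed rules concrete, here is a minimal user-space sketch
of the classification, not kernel code. Everything in it is hypothetical:
the io_hint values stand in for the new open() flags (which do not exist),
SMALL_IO_BYTES is an arbitrary placeholder for the still-undefined
small/large cutoff, and struct io_req is only an illustration, not a
block layer structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file hint, standing in for the proposed open() flags. */
enum io_hint { IO_HINT_NONE, IO_HINT_SYNC, IO_HINT_ASYNC };

/* Hypothetical cutoff; what counts as "small" is still an open question. */
#define SMALL_IO_BYTES ((size_t)128 * 1024)

/* Illustrative request descriptor, not a real kernel structure. */
struct io_req {
	size_t bytes;          /* size of the request after merging */
	bool from_page_fault;  /* issued by the page fault handler */
	bool mmapped_file;     /* the fault hit an mmapped file */
	enum io_hint hint;     /* hint from the associated open(), if any */
};

/* Return true if the request should take the sync path. */
static bool treat_as_sync(const struct io_req *r)
{
	/* An explicit hint from open() always wins. */
	if (r->hint == IO_HINT_SYNC)
		return true;
	if (r->hint == IO_HINT_ASYNC)
		return false;

	/* Page faults not backed by an mmapped file are always sync. */
	if (r->from_page_fault && !r->mmapped_file)
		return true;

	/* Otherwise fall back to the size heuristic on the merged request. */
	return r->bytes <= SMALL_IO_BYTES;
}
```

With these rules, a large merged read of an mmapped file with no hint
lands in the async category, while the same request on a file opened
with the (hypothetical) sync hint stays sync, which is what shared
libraries and exec() mappings would rely on.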

Hopefully what I am saying here makes sense. If you have links to some
background information to point me to so I get a better understanding of
how async vs sync requests are handled by CFQ, I would greatly
appreciate it.

Best regards,

Mathieu




-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68



