The ChipList, by Adrian Offerman; The Processor Portal

Mon 5 Mar 2012, 9:00


There are eight MEM_TRANS_RETIRED.LOAD_LATENCY_GT_* precise events available on Intel® Microarchitecture Codename Sandy Bridge. These events let you pinpoint loads whose latency exceeded a given threshold, measured in CPU clock cycles. For example, the MEM_TRANS_RETIRED.LOAD_LATENCY_GT_4 event counts loads that took more than 4 clocks, and the MEM_TRANS_RETIRED.LOAD_LATENCY_GT_512 event counts loads that took more than 512 clocks.

These events are sampled by the Intel® VTune™ Amplifier XE performance profiler differently from most other events. When you elect to sample one of these events, special hardware tracks a data load from issue to completion. This is more complicated than simply counting instances of an event (as with normal event-based sampling), so only some loads are tracked. Loads are chosen at random, the latency of each is determined, and every event whose threshold the load exceeded is incremented (latency >4, >8, >16, etc.). Because of how this sampling works, only a small percentage of an application's data loads can be tracked at any one time.
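
Because a tracked load increments every event whose threshold it exceeds, the per-event counts are cumulative, and subtracting adjacent counts gives a rough count per latency range. The following is a minimal Python sketch of that arithmetic, not VTune functionality. It assumes the eight thresholds are the powers of two from 4 to 512 (consistent with the examples above), that the counts come from the same run and are directly comparable, and it uses made-up numbers. The helper names are hypothetical.

```python
# Assumed thresholds (in CPU clock cycles) of the eight load-latency events.
THRESHOLDS = [4, 8, 16, 32, 64, 128, 256, 512]


def event_name(threshold):
    """Build the event name for a given latency threshold."""
    return f"MEM_TRANS_RETIRED.LOAD_LATENCY_GT_{threshold}"


def events_for_latency(latency):
    """List the events that a tracked load of this latency would increment."""
    return [event_name(t) for t in THRESHOLDS if latency > t]


def latency_buckets(cumulative):
    """Convert cumulative '> threshold' counts into per-range counts.

    `cumulative` maps each threshold to the number of tracked loads
    whose latency exceeded it.
    """
    buckets = {}
    for low, high in zip(THRESHOLDS, THRESHOLDS[1:]):
        # Clamp at zero in case the counts are not perfectly monotone.
        buckets[f"{low + 1}-{high}"] = max(cumulative[low] - cumulative[high], 0)
    # Loads above the largest threshold form the open-ended last bucket.
    buckets[f">{THRESHOLDS[-1]}"] = cumulative[THRESHOLDS[-1]]
    return buckets


# Hypothetical counts for illustration only, not measured data.
example_counts = {4: 1000, 8: 400, 16: 150, 32: 60, 64: 25, 128: 10, 256: 4, 512: 1}

if __name__ == "__main__":
    print(events_for_latency(20))
    for latency_range, count in latency_buckets(example_counts).items():
        print(f"{latency_range:>8} clocks: {count}")
```

If the events were collected with different sampling rates or in separate runs, the counts would need to be scaled to a common basis before subtracting.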

By sampling a range of latencies with these events, you can determine your application's general latency distribution and, because the events are precise, pinpoint any overly long loads.  But data ...
Filed under: Intel® VTune™ Amplifier XE Knowledge Base