Recently, I spent almost three weeks optimizing one of our broadband data aggregator drivers to handle nearly 800 Mbps. Many optimizations, such as branch prediction hints, efficient buffer management, removal of unused checks, and NAPI, were added to the Ethernet and Packet Engine drivers.
Here is my understanding of NAPI: In recent years, network speeds have increased drastically, and handling packets at wire speed has become very important. The old network driver model does not perform well at high speed, because it suffers from the large number of interrupts. The NAPI (New API) design introduces interrupt mitigation and packet throttling techniques to mitigate this performance problem.
Let us look at a simple example to understand the logic behind NAPI. I am running a small (imaginary) office. Every day, a lot of people come here to discuss some issue and get suggestions from me.
In the early morning, only a few people visit. So I put a calling bell at the front of the office and continue my work (or sleep). When someone comes, (s)he rings the bell to interrupt me, so that I can attend to them.
In the evening, a lot of people visit. If everyone starts ringing the bell, I will waste most of my time checking who has arrived and asking him/her to wait. So in the evening, I remove the calling bell and put up a board asking visitors to wait for their turn. Because of the heavy rush, around 8 to 10 people may be waiting at once. To make them comfortable, I have arranged a few (maybe 10) chairs. If all the chairs are full, a new person will not wait and will come back after some time. Whenever I finish a discussion with one person, I go out and invite the next person in.
A similar technique is used in NAPI. When the incoming packet count is below a threshold value, the interrupt is enabled and polling is disabled. Instead of wasting processor time on periodic polling for packets, the processor can do other useful work or go to sleep to save power. When the incoming packet count is above the threshold value, the interrupt is disabled and polling is enabled. Instead of wasting processor time on interrupt acknowledgment, the processor can efficiently handle the packets that have already been accepted. While polling is enabled, incoming packets are stored in a circular buffer, and when the circular buffer is full, newly arriving packets are dropped. A reliable upper-layer protocol such as TCP will then retransmit the dropped packets.
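
To make the bell-and-chairs logic concrete, here is a minimal user-space sketch in C. It is not driver code and not the kernel's NAPI API: the names `sim_nic`, `sim_nic_rx`, `sim_nic_poll`, and `RING_SIZE` are hypothetical, and the `budget` argument is my assumption for how the "threshold" can be modeled (a poll that handles fewer packets than its budget means the rush is over). In a real Linux driver, the interrupt handler would call `napi_schedule()` and the poll routine would call `napi_complete_done()` at the corresponding points.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 10  /* the "chairs" in the waiting room (hypothetical size) */

/* Hypothetical model of a NIC receive path; not a kernel structure. */
struct sim_nic {
    int           ring[RING_SIZE]; /* circular buffer of waiting packet ids */
    size_t        head;            /* index of the oldest waiting packet */
    size_t        count;           /* number of packets currently waiting */
    bool          irq_enabled;     /* true: the bell is installed */
    int           last_pkt;        /* id of the most recently handled packet */
    unsigned long irqs;            /* interrupts taken so far */
    unsigned long dropped;         /* packets lost because the ring was full */
    unsigned long delivered;       /* packets handled by the poll routine */
};

static void sim_nic_init(struct sim_nic *nic)
{
    /* Everything starts at zero except the interrupt, which starts enabled. */
    *nic = (struct sim_nic){ .irq_enabled = true };
}

/* A packet arrives from the wire. */
static void sim_nic_rx(struct sim_nic *nic, int pkt)
{
    if (nic->count == RING_SIZE) {
        nic->dropped++;            /* all chairs taken: the visitor leaves */
        return;
    }
    nic->ring[(nic->head + nic->count) % RING_SIZE] = pkt;
    nic->count++;

    if (nic->irq_enabled) {
        /* The bell rings once; then it is removed and polling takes over. */
        nic->irqs++;
        nic->irq_enabled = false;
    }
}

/* Handle at most `budget` waiting packets; returns how many were handled. */
static int sim_nic_poll(struct sim_nic *nic, int budget)
{
    int done = 0;

    assert(budget > 0);
    while (done < budget && nic->count > 0) {
        nic->last_pkt = nic->ring[nic->head];
        nic->head = (nic->head + 1) % RING_SIZE;
        nic->count--;
        nic->delivered++;
        done++;
    }
    if (done < budget)
        nic->irq_enabled = true;   /* ring drained: put the bell back */
    return done;
}
```

In this sketch, feeding twelve packets into a freshly initialized `sim_nic` without polling causes a single interrupt, fills the ten ring slots, and drops the last two packets. A following `sim_nic_poll(&nic, 4)` handles four packets and stays in polling mode, and a later `sim_nic_poll(&nic, 8)` drains the remaining six and re-enables the interrupt.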
Interrupts work best when packets arrive at a low rate. Polling works best when packets arrive at a high rate, because almost every poll finds a packet waiting.
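
The trade-off can be illustrated with a small counting sketch in C. It assumes a simplified model that is mine, not measured from any driver: time is split into ticks, an interrupt-only driver takes one interrupt per packet, and a polling-only driver polls once per tick. The names `rx_cost` and `rx_cost_count` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Overhead counters for the two pure strategies (hypothetical model). */
struct rx_cost {
    unsigned long interrupts;  /* interrupt-only: one interrupt per packet */
    unsigned long empty_polls; /* polling-only: polls that found no packet */
};

/* arrivals[t] is the number of packets that arrive during tick t. */
static struct rx_cost rx_cost_count(const unsigned *arrivals, size_t ticks)
{
    struct rx_cost c = { 0, 0 };

    assert(arrivals != NULL || ticks == 0);
    for (size_t t = 0; t < ticks; t++) {
        c.interrupts += arrivals[t];   /* every packet would ring the bell */
        if (arrivals[t] == 0)
            c.empty_polls++;           /* this poll would be wasted work */
    }
    return c;
}
```

Under this model, a quiet pattern such as `{0,0,1,0,0,0,0,1}` costs only 2 interrupts but would waste 6 of 8 polls, while a busy pattern such as `{3,4,2,5,3,4,6,2}` wastes no polls but would cost 29 interrupts.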