Description
When a node detects high memory use (more than 80% of its quota), it should stop processing new requests until usage falls back below that level. If free system memory drops below 10%, the node should halt all active requests and release their memory, to prevent the node from becoming unresponsive or the service from being OOM-killed.
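As a rough illustration of the two thresholds above, the decision could be sketched as follows. This is a minimal sketch, not the proposed implementation: the names `Action` and `decide` and the byte-count parameters are hypothetical, and how the node actually samples memory is left out.

```python
from enum import Enum

# Thresholds taken from the description; the constant names are hypothetical.
SOFT_LIMIT_PERCENT = 80  # of node quota: stop accepting new requests
MIN_FREE_PERCENT = 10    # of system memory: halt active requests


class Action(Enum):
    ACCEPT = "accept"            # normal operation
    PAUSE_NEW = "pause_new"      # stop taking new requests; active ones continue
    HALT_ACTIVE = "halt_active"  # halt active requests and release memory


def decide(used_bytes: int, quota_bytes: int,
           system_free_bytes: int, system_total_bytes: int) -> Action:
    """Map current memory readings to one of the actions described above."""
    # The system-wide check comes first because it is the more severe condition.
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    if system_free_bytes * 100 < MIN_FREE_PERCENT * system_total_bytes:
        return Action.HALT_ACTIVE
    if used_bytes * 100 > SOFT_LIMIT_PERCENT * quota_bytes:
        return Action.PAUSE_NEW
    return Action.ACCEPT
```

Both comparisons are strict, so a node sitting exactly at 80% of quota or exactly 10% free is still treated as healthy; whether the boundaries should be inclusive is an open choice.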
Also provide a feature control flag that enforces the configured node-quota as the hard limit (i.e. the request-halt threshold) instead of the system free memory check.
In the absence of an explicit node-quota, we'll use 50% of total system memory as the quota. (Ideally we want to move away from systems without a node-quota.)
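The quota fallback and the hard-limit flag might combine roughly as below. This is a sketch under stated assumptions: `effective_quota`, `should_halt` and the `enforce_quota_hard_limit` parameter are hypothetical names, and treating "usage at or above quota" as the halt condition when the flag is on is one reading of the description rather than a settled rule.

```python
from typing import Optional

DEFAULT_QUOTA_PERCENT = 50  # fallback share of system memory, per the description
MIN_FREE_PERCENT = 10       # system free-memory floor, per the description


def effective_quota(configured_quota_bytes: Optional[int],
                    system_total_bytes: int) -> int:
    """Return the configured node-quota, or 50% of system memory if none is set."""
    if configured_quota_bytes is not None:
        return configured_quota_bytes
    return system_total_bytes * DEFAULT_QUOTA_PERCENT // 100


def should_halt(used_bytes: int, quota_bytes: int,
                system_free_bytes: int, system_total_bytes: int,
                enforce_quota_hard_limit: bool) -> bool:
    """Decide whether active requests should be halted."""
    if enforce_quota_hard_limit:
        # Flag on: the quota itself is the hard limit (assumed inclusive).
        return used_bytes >= quota_bytes
    # Flag off: fall back to the system free-memory check.
    return system_free_bytes * 100 < MIN_FREE_PERCENT * system_total_bytes
```

For example, a node with 1000 bytes of system memory and no configured quota gets `effective_quota(None, 1000) == 500`.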