Version: 4.6.1

Monitoring Best Practices

This guide explains how to monitor metrics in real time and act on what they show. It covers how to respond to common issues and how to adjust capacity.

Quick Reference

Key Metrics to Watch:

Queue + Wait Time: Indicates whether Limited Inflow is set appropriately
Inflow vs Outflow: Indicates integration health
Process Time: Indicates server load state

Healthy Indicators:

Inflow ≈ Outflow (within 10-20%)
Process Time: Stable or decreasing
Queue Size: Low with short wait times
Outflow Rate (Outflow as a share of Inflow): >80%

Warning Signs:

Process Time increasing steadily → Server stress or performance degradation
Outflow Rate <70% → Missing explicit exits or integration issues
Queue Size growing with increasing wait time → Demand exceeding capacity

Critical Issues (Immediate Action Required):

Process Time spiking dramatically → Reduce Limited Inflow immediately
Outflow Rate <50% → Critical integration failure; reduce Timeout and investigate
Queue Size growing rapidly with high wait time → Capacity exhausted; assess server state
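
The Outflow Rate thresholds above can be sketched as a small classifier. This is only an illustration: it assumes Outflow Rate is Outflow divided by Inflow (consistent with the numbers in Scenario 2), the function names are hypothetical, and the "borderline" label for the 70-80% band is an assumption because the guide does not classify that range.

```python
def outflow_rate(inflow_tps: float, outflow_tps: float) -> float:
    """Outflow as a fraction of inflow (assumed definition); 1.0 when there is no inflow."""
    if inflow_tps <= 0:
        return 1.0
    return outflow_tps / inflow_tps


def classify_outflow(inflow_tps: float, outflow_tps: float) -> str:
    """Map the Outflow Rate onto the bands listed in the Quick Reference."""
    rate = outflow_rate(inflow_tps, outflow_tps)
    if rate < 0.50:
        return "critical"    # integration failure; reduce Timeout and investigate
    if rate < 0.70:
        return "warning"     # missing explicit exits or integration issues
    if rate > 0.80:
        return "healthy"
    return "borderline"      # 70-80%: not classified in the guide (assumption)
```

For example, `classify_outflow(100, 40)` falls in the critical band, while `classify_outflow(100, 90)` is healthy.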

Common Scenarios and Responses

Scenario 1: High Queue, High Wait Time, Server Has Capacity

Symptoms:

Queue Size: High
Average Wait Time: High
Server Resources: Underutilized

What it means: Limited Inflow is set too low. Your server can handle more traffic, but NetFUNNEL is restricting entry too much.

Immediate Actions:

Check server resource utilization (CPU, memory, I/O)
If resources are underutilized, increase Limited Inflow by 10-20%
Monitor Queue Size and Wait Time for 5-10 minutes - they should decrease
If improved, consider another incremental increase

Example:

Current situation:
- Limited Inflow: 100 TPS
- Queue Size: 200 users
- Wait Time: 20 seconds
- Server CPU: 50% (has capacity)

Action: Increase Limited Inflow to 110-120 TPS
Monitor: Queue Size and Wait Time should decrease
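
The increase step can be expressed as a short calculation. The sketch below is a hedged illustration, not part of NetFUNNEL: the function name and the 70% CPU cutoff for "has capacity" are assumptions (the guide only gives 50% as an example of spare capacity), so substitute the threshold your own monitoring uses.

```python
from typing import Optional, Tuple


def suggest_increase(limited_inflow_tps: float, cpu_percent: float,
                     cpu_cutoff_percent: float = 70.0) -> Optional[Tuple[float, float]]:
    """Return the (low, high) range for a 10-20% increase, or None if the server lacks headroom."""
    if cpu_percent >= cpu_cutoff_percent:   # hypothetical cutoff, not a NetFUNNEL value
        return None
    low = round(limited_inflow_tps * 1.10, 1)
    high = round(limited_inflow_tps * 1.20, 1)
    return (low, high)
```

With the example values (100 TPS, CPU at 50%), `suggest_increase(100, 50)` gives the 110-120 TPS range. Apply one step, watch Queue Size and Wait Time for 5-10 minutes, then decide whether to step again.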

Scenario 2: Low Outflow Rate with High Queue

Symptoms:

Inflow: 100 TPS
Outflow: 50 TPS (or lower)
Outflow Rate: <50%
Queue Size: Growing
Wait Time: Increasing

What it means: Explicit service exits are not happening properly. Users are entering but not explicitly returning keys, causing capacity to be held unnecessarily.

Immediate Actions:

Reduce Timeout values immediately:
- If Process Time is 1-2 seconds, set Timeout minimum to 1s and maximum to 2s
- This frees up capacity quickly for new users
- Timeout settings can be adjusted in segment Advanced - Timing (Basic Control) or Advanced - Timing (Section Control)
Monitor Queue Size - it should start decreasing
If queue remains long after Timeout adjustment, consider increasing Limited Inflow (if server resources allow)

Long-term Actions:

Investigate root cause:
- Check if nfStop() calls are missing in code
- Verify integration implementation
- Review error logs for integration failures
Fix integration issues:
- Add nfStop() calls in all appropriate places
- Ensure error handling includes key return
Optimize Timeout settings:
- Set Timeout based on actual Process Time + buffer
- Example: If Process Time averages 5 seconds, set Timeout to 6-7 seconds
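
One way to make sure error handling includes key return is to put the return call in a `finally` block, so it runs whether the protected work succeeds or fails. The sketch below shows the pattern only: `release_key` is a hypothetical callable standing in for whatever your integration uses to return the key (for example, a wrapper around nfStop()), and `run_with_key_return` is not a NetFUNNEL API.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_key_return(do_work: Callable[[], T], release_key: Callable[[], None]) -> T:
    """Run the protected work and always return the key afterwards, even on error."""
    try:
        return do_work()
    finally:
        # Runs on success and on exception, so the key is never left to time out.
        release_key()
```

The same structure applies in other languages (`try`/`finally` in JavaScript, for instance). The point is that the key return must not sit only on the success path.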

Missing Explicit Exits

If Outflow Rate is consistently below 70%, treat it as an integration issue that needs a code fix. Reducing Timeout mitigates the immediate problem, but you must still fix the root cause.

Scenario 3: Process Time Increasing

Symptoms:

Process Time: Gradually increasing
Queue Size: May or may not be increasing

What it means: Server load is increasing or performance is degrading. This is an indirect indicator of server stress.

Immediate Actions:

Check server resources (CPU, memory via APM or server monitoring)
If server is overloaded: Reduce Limited Inflow by 40-50% immediately
If server has capacity: The cause may lie elsewhere (network, database, etc.); investigate those components
Monitor Process Time trends - if it continues increasing, reduce Limited Inflow further

Example:

Current situation:
- Limited Inflow: 100 TPS
- Process Time: Increasing from 2s to 5s
- Server CPU: 90% (overloaded)

Action: Reduce Limited Inflow to 50-60 TPS immediately
Monitor: Process Time should stabilize
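
The decision in this scenario can be sketched as follows. This is an illustrative helper under stated assumptions: the function names are hypothetical, "increasing" is simplified to "every sample is higher than the previous one", and whether the server is overloaded is passed in from your own monitoring because NetFUNNEL does not measure it directly.

```python
from typing import Optional, Sequence, Tuple


def is_rising(process_times_s: Sequence[float]) -> bool:
    """True when each Process Time sample is higher than the one before it."""
    return len(process_times_s) >= 2 and all(
        later > earlier
        for earlier, later in zip(process_times_s, process_times_s[1:])
    )


def suggest_decrease(limited_inflow_tps: float, process_times_s: Sequence[float],
                     server_overloaded: bool) -> Optional[Tuple[float, float]]:
    """Return the (low, high) range after a 40-50% cut, or None if no cut is indicated."""
    if not (server_overloaded and is_rising(process_times_s)):
        return None
    low = round(limited_inflow_tps * 0.50, 1)    # 50% reduction
    high = round(limited_inflow_tps * 0.60, 1)   # 40% reduction
    return (low, high)
```

With the example values, `suggest_decrease(100, [2, 3, 4, 5], True)` gives the 50-60 TPS range. If Process Time rises while the server still has capacity, the function returns `None`, which matches the guidance to investigate other causes instead.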

Long-term Actions:

Performance optimization:
- Profile application to identify slow components
- Optimize database queries
- Scale server resources if needed
Capacity planning:
- Determine optimal Limited Inflow based on Process Time thresholds
- Set up alerts for Process Time exceeding thresholds

Timeout Optimization

Timeout settings determine how long NetFUNNEL waits before automatically returning keys when explicit exits don't occur.

Default Range: 6-20 seconds (minimum-maximum)

How It Works:

NetFUNNEL uses the maximum value initially
Keys are automatically returned after timeout if nfStop() isn't called

Setting Optimal Timeout:

Monitor Process Time over time to identify typical range
Set minimum to typical minimum Process Time
Set maximum to typical maximum Process Time + 20-30% buffer
Example: If Process Time is 8-12s, set Timeout to 8-15s
If Outflow Rate is low (<70%), reduce Timeout to free capacity faster
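
The rule above can be turned into a small calculation over observed Process Time samples. This is a sketch under assumptions: it uses the raw minimum and maximum of the samples as the "typical" range (you may prefer percentiles to ignore outliers), it rounds to whole seconds, and the function name and the 25% default buffer are illustrative choices within the 20-30% band.

```python
import math
from typing import Sequence, Tuple


def suggest_timeout_range(process_times_s: Sequence[float],
                          buffer: float = 0.25) -> Tuple[int, int]:
    """Return (minimum, maximum) Timeout in whole seconds from Process Time samples."""
    if not process_times_s:
        raise ValueError("at least one Process Time sample is required")
    minimum = math.floor(min(process_times_s))                  # typical minimum
    maximum = math.ceil(max(process_times_s) * (1 + buffer))    # typical maximum + buffer
    return (minimum, maximum)
```

For samples spanning 8-12 seconds, `suggest_timeout_range([8, 10, 12])` gives a Timeout range of 8-15 seconds, matching the example above.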

Timeout vs Process Time

If Process Time regularly exceeds Timeout, users will be forcibly exited before service completion. Always set Timeout above typical Process Time with a safety buffer.
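
To check whether a Timeout setting is too tight, you can estimate what share of observed requests would have exceeded it. The helper below is a hypothetical sketch that works on Process Time samples exported from your own monitoring; it is not a NetFUNNEL feature.

```python
from typing import Sequence


def forced_exit_share(process_times_s: Sequence[float], timeout_max_s: float) -> float:
    """Fraction of samples whose Process Time is longer than the maximum Timeout."""
    if not process_times_s:
        return 0.0
    exceeded = sum(1 for t in process_times_s if t > timeout_max_s)
    return exceeded / len(process_times_s)
```

A result well above zero suggests the maximum Timeout sits below typical Process Time and should be raised.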

Important Notes

About Limited Inflow adjustments:

Increase: 10-20% incrementally, monitor for 5-10 minutes
Decrease: 40-50% aggressively when protecting server

About Process Time:

NetFUNNEL doesn't directly monitor server CPU/memory
Process Time is an indirect indicator of server load
Always cross-reference with your server monitoring tools (APM, etc.)
