I have something like a netflow table and would like to group it in such a way that it groups it by (src_ip, src_port, dst_ip, dst_port) where the values could be swapped between the src and dst fields.
src_ip | src_port | dst_ip | dst_port | bytes_sent |
---|---|---|---|---|
192.168.1.1 | 123 | 192.168.10.5 | 321 | 111 |
192.168.10.5 | 321 | 192.168.1.1 | 123 | 222 |
10.0.0.5 | 50 | 172.0.0.5 | 55 | 500 |
172.0.0.5 | 55 | 10.0.0.5 | 50 | 300 |
192.168.1.1 | 123 | 192.168.10.5 | 321 | 1000 |
I would like to have the following result from this table:
src_ip | src_port | dst_ip | dst_port | bytes_sent | bytes_recv |
---|---|---|---|---|---|
192.168.1.1 | 123 | 192.168.10.5 | 321 | 1111 | 222 |
10.0.0.5 | 50 | 172.0.0.5 | 55 | 500 | 800 |
Basically, trying to capture the traffic going both ways in a single row. So something like grouping by (src_ip, src_port) and (dst_ip, dst_port) where those values could be inverted. What would be the best way to achieve this?
2
Answers
Assuming the smallest IP is the source IP, and the largest is the destination IP.
You can use
LEAST
andGREATEST
functions to ensures that for each combination of the smallest and largest IP addresses, one entry will be selected :Demo here
You can use a self-
join
:See fiddle