I need to store data (float) from 20,000 sensors once a second. I originally wanted to create the following table:
| time | sensor 1 | sensor 2 | … | sensor 20000 |
|---|---|---|---|---|
| 2024-09-06 13:00:00 | 1.2 | 5.3 | … | 2.0 |
But then I found that a table cannot have more than 1,600 columns in PostgreSQL. What’s the best practice for storing this kind of data? Should I split it across multiple tables, or switch to another type of database?
All 20000 sensor values are read and inserted together.
I need to query up to 100 of them per second to plot trend charts.
2 Answers
Store all 20,000 sensor readings in a `json` column, or, if you always have all the readings, use an array type:
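A minimal sketch of the two layouts, with table and column names of my own choosing (using `jsonb`, since the comparison below does):

```sql
-- One row per second; all 20,000 readings travel together in a single value.

-- jsonb column, keyed by sensor id: {"1": 1.2, "2": 5.3, ..., "20000": 2.0}
CREATE TABLE readings_jsonb (
    ts       timestamptz PRIMARY KEY,
    readings jsonb NOT NULL
);

-- array column, sensor id = array index: {1.2, 5.3, ..., 2.0}
CREATE TABLE readings_array (
    ts       timestamptz PRIMARY KEY,
    readings numeric[] NOT NULL
);
```

Either way the whole second’s worth of readings goes in as one row, which matches "all 20000 sensor values are read and inserted together".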
Here’s how much space it takes to store 1 minute of the randomly generated per-second readings from 20k sensors with 10% sparsity (they share `setseed()`, so the random data they save is the exact same):

| numeric[] SQL array | jsonb array | jsonb object | hstore | entity-attribute-value |
|---|---|---|---|---|
| … | … | … | … | … |

Column names link to documentation, cells link to db<>fiddle demos you can play around with.
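As a sketch of the measurement idea (seed value and table names are illustrative, not from the original demos), every layout is loaded after the same `setseed()` call and then its on-disk footprint is compared:

```sql
-- same seed before generating data for every layout,
-- so each one ends up storing identical pseudo-random readings
SELECT setseed(0.5);

-- after loading one minute of data, compare on-disk footprints
SELECT pg_size_pretty(pg_total_relation_size('readings_array')) AS array_size,
       pg_size_pretty(pg_total_relation_size('readings_jsonb')) AS jsonb_size;
```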
In each case you can save space by reducing the precision and scale of your readings, e.g. using a `numeric(4,2)`. That results in the array going down in size to 2.5 MB, and it also shows how much of the EAV is just overhead and duplication of the timestamp and sensor identifiers, as it only shrinks to 46 MB.

Space consumption is only one of the factors, but you can use these as a starting point for further tests.
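For example, a reduced-precision variant of the array column could be declared like this (a sketch; `numeric(4,2)` caps values at 99.99):

```sql
CREATE TABLE readings_array_small (
    ts       timestamptz PRIMARY KEY,
    readings numeric(4,2)[] NOT NULL  -- at most 4 digits, 2 of them after the decimal point
);
```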
`numeric[]`:
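A rough sketch of this layout, assuming the `readings_array` table from above (seed, time range, and sensor number are arbitrary):

```sql
SELECT setseed(0.5);

-- one minute of per-second readings, ~10% of them NULL (sparsity)
INSERT INTO readings_array (ts, readings)
SELECT ts,
       array_agg(CASE WHEN random() < 0.1 THEN NULL
                      ELSE round(random()::numeric * 100, 2) END
                 ORDER BY sensor_id)
FROM generate_series(timestamptz '2024-09-06 13:00:00',
                     timestamptz '2024-09-06 13:00:59',
                     interval '1 second') AS ts
CROSS JOIN generate_series(1, 20000) AS sensor_id
GROUP BY ts;

-- trend chart for sensor 42: the array index is the sensor id
SELECT ts, readings[42] AS sensor_42 FROM readings_array ORDER BY ts;
```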
`jsonb` array:
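Same idea, packed into a `jsonb` array instead (again a sketch with my own names; note that jsonb arrays are 0-indexed when read back):

```sql
CREATE TABLE readings_jsonb_arr (
    ts       timestamptz PRIMARY KEY,
    readings jsonb NOT NULL
);

SELECT setseed(0.5);

INSERT INTO readings_jsonb_arr (ts, readings)
SELECT ts,
       jsonb_agg(CASE WHEN random() < 0.1 THEN NULL
                      ELSE round(random()::numeric * 100, 2) END
                 ORDER BY sensor_id)
FROM generate_series(timestamptz '2024-09-06 13:00:00',
                     timestamptz '2024-09-06 13:00:59',
                     interval '1 second') AS ts
CROSS JOIN generate_series(1, 20000) AS sensor_id
GROUP BY ts;

-- trend chart for sensor 42 (element 41, 0-based)
SELECT ts, readings->41 AS sensor_42 FROM readings_jsonb_arr ORDER BY ts;
```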
`jsonb` object:
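And the object variant, keyed by sensor id, filling the `readings_jsonb` table sketched earlier:

```sql
SELECT setseed(0.5);

INSERT INTO readings_jsonb (ts, readings)
SELECT ts,
       jsonb_object_agg(sensor_id::text,
                        CASE WHEN random() < 0.1 THEN NULL
                             ELSE round(random()::numeric * 100, 2) END)
FROM generate_series(timestamptz '2024-09-06 13:00:00',
                     timestamptz '2024-09-06 13:00:59',
                     interval '1 second') AS ts
CROSS JOIN generate_series(1, 20000) AS sensor_id
GROUP BY ts;

-- trend chart for sensor 42, looked up by its key
SELECT ts, (readings->>'42')::numeric AS sensor_42 FROM readings_jsonb ORDER BY ts;
```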
entity-attribute-value:
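Finally, the EAV layout for comparison, one row per sensor per second (20,000 rows per second, which is where the overhead mentioned above comes from); names are again illustrative:

```sql
CREATE TABLE readings_eav (
    ts        timestamptz NOT NULL,
    sensor_id int         NOT NULL,
    reading   numeric,
    PRIMARY KEY (ts, sensor_id)
);

SELECT setseed(0.5);

INSERT INTO readings_eav (ts, sensor_id, reading)
SELECT ts, sensor_id,
       CASE WHEN random() < 0.1 THEN NULL
            ELSE round(random()::numeric * 100, 2) END
FROM generate_series(timestamptz '2024-09-06 13:00:00',
                     timestamptz '2024-09-06 13:00:59',
                     interval '1 second') AS ts
CROSS JOIN generate_series(1, 20000) AS sensor_id;

-- trend chart for sensor 42
SELECT ts, reading FROM readings_eav WHERE sensor_id = 42 ORDER BY ts;
```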