Simulate GROUP BY with awk

Recently, I got a necessity of calculation an average value for each group of a data in my clipboard. To avoid creating a temporary file or using Excel for that, I used awk tool. Just writing it down for a future reference.

Example source data:

1
2
3
4
5
6
23100 1600
20500 1600
19400 1600
36900 512
43100 512
48000 512
1
2
3
4
5
6
7
8
9
$ # SUM
$ awk '{arr[$2]+=$1} END {for (i in arr) {print i, arr[i]}}'
512 128000
1600 63000
$ # AVG
$ awk '{arr_sum[$2]+=$1; arr_cnt[$2]+=1} END \
       {for (i in arr_sum) {print i, arr_sum[i]/arr_cnt[i]}}'
512 42666.7
1600 21000

Read more

What MySQL is silent about

Test:

1
2
3
4
5
6
7
8
CREATE TABLE test_int (
    u32 INT(10) UNSIGNED,
    i32 INT(10)
);

INSERT INTO test_int (u32, i32) VALUES(4294967295, 4294967295);

SELECT * FROM test_int;

Output:

1
2
3
4
5
+------------+------------+
| u32        | i32        |
+------------+------------+
| 4294967295 | 2147483647 |
+------------+------------+

As you can see, MySQL silently cut the value if doesn’t fit to a column type.

For example, PostgreSQL gives the following error in such case:

1
ERROR:  smallint out of range

Test for PostgreSQL:

1
2
3
4
5
6
CREATE TABLE test_int (
    small smallint,
    medium integer
);

INSERT INTO test_int (small, medium) VALUES(4294967295, 4294967295);

😒

Read more