AI Cannot Experiment: What a Simple Linux Command Tells Us

Linux and UNIX systems have a ‘sar’ command that “writes to standard output the contents of selected cumulative activity counters in the operating system”. It shows system statistics for each interval (10 minutes by default), like this:

12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:02 AM     all      5.29      0.06      0.70      0.29      0.00     93.65
12:20:01 AM     all      3.14      0.00      0.51      0.06      0.00     96.28

(Ignore the first timestamp, “12:00:01 AM”; that position really should hold a column name such as “timestamp”.) A natural question is whether the time in the first column is the starting or the ending moment of the 10-minute interval. For example, does “12:10:02 AM” mean that the CPU usage, I/O wait, etc. to its right are the averages for the interval beginning at 12:10:02 AM and ending at about 12:20:00 AM, or the averages for the 10 minutes leading up to 12:10:02 AM? Surprisingly, Google searches with various keyword combinations returned no webpage that discusses this. I then checked a few UNIX and Linux system administration books I bought many years ago, and I found and searched the source code of the ‘sar’ command. Neither offered a clear answer, although I admit I didn’t spend much time reading the code. Finally, I posed this question to various AI websites:

According to https://github.com/sysstat/sysstat/blob/master/sar.c , can we tell whether the time displayed at the beginning of each row of the Linux sar command output is the beginning or ending moment of the 10-minute time range?

It’s funny that Google, ChatGPT, and DeepSeek all confidently told me the timestamp is the beginning of the interval. (To trigger Google’s AI, I had to omit the “According to …” clause.) ChatGPT did not say it consulted the source code. DeepSeek said its answer was “[b]ased on the sar.c source code from the systat GitHub repository”, and its “Code Context” section mentioned the get_time() and get_localtime() functions. But when I asked where those functions are called, since I don’t see them in sar.c, it apologized. Perplexity AI, however, told me the timestamp is the end of the interval, with no mention of the source code. My question is important because I’m in the middle of writing an email to my manager and a coworker, who are responding to our director’s inquiry about a performance problem that happened a few days ago. For a long time, I’ve interpreted the time as the ending time of the interval. But today I wanted to be cautious, so I checked, and, to my surprise, these state-of-the-art AIs gave me conflicting answers.

Then I had a second thought. Let’s check the output of ‘sar’ against the current time, e.g.,

$ date
Fri Mar 14 09:57:43 AM CDT 2025
$ sar | tail -3
09:40:01 AM     all      2.45      0.00      1.09      0.11      0.00     96.35
09:50:01 AM     all      5.98      0.04      1.10      0.27      0.00     92.62
Average:        all      4.93      0.00      0.94      0.47      0.00     93.65

In the above output, the last row before “Average” shows 09:50:01. That must be the ending time of the interval for this row, which covers the time range 09:40–09:50. If it were the beginning time, the row would cover 09:50 to 10:00. But the current time is 09:57, not yet 10:00. How could ‘sar’ possibly know the stats for a time that has not yet come?
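The reasoning above can be written down as a small check. This is only a sketch of the argument, not anything taken from sar itself: the function name could_be_interval_start is made up for illustration, and the 10-minute interval is assumed from the spacing of the rows in the output.

```python
from datetime import datetime, timedelta

def could_be_interval_start(now, last_stamp, interval=timedelta(minutes=10)):
    """Hypothetical helper: return True only if a row stamped `last_stamp`
    could describe the interval that STARTS at that moment, which requires
    the whole interval to have already elapsed by `now`."""
    return last_stamp + interval <= now

# Values copied from the 'date' and 'sar | tail -3' output above.
now = datetime(2025, 3, 14, 9, 57, 43)
last_stamp = datetime(2025, 3, 14, 9, 50, 1)

# 09:50:01 plus 10 minutes is 10:00:01, which is later than 09:57:43,
# so the row cannot be reporting an interval that starts at 09:50:01.
print(could_be_interval_start(now, last_stamp))  # False
```

The check can only rule out the "start" reading when the command is run before the next interval would have finished; run at 10:01, it would be inconclusive, which is why the timing of the experiment matters.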

The author of sar.c lists his email address right in the C source code, so I emailed him. He replied and confirmed my understanding. Problem solved!
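With the timestamp confirmed as the end of the interval, each row can be mapped to the period it covers. The snippet below is a minimal sketch under two assumptions of mine: the interval is exactly 10 minutes, and the timestamp is in the 12-hour “HH:MM:SS AM/PM” format shown in the output above (parsing “AM”/“PM” with %p assumes an English or C locale). The name covered_range is hypothetical.

```python
from datetime import datetime, timedelta

def covered_range(stamp_text, interval=timedelta(minutes=10)):
    """Hypothetical helper: treat the sar timestamp as the END of the
    interval and return (start, end) as 24-hour 'HH:MM:SS' strings."""
    end = datetime.strptime(stamp_text, "%I:%M:%S %p")
    start = end - interval
    return start.strftime("%H:%M:%S"), end.strftime("%H:%M:%S")

# The last data row from the 'sar | tail -3' output above.
print(covered_range("09:50:01 AM"))  # ('09:40:01', '09:50:01')
```

In practice the real interval is the gap between consecutive rows, which can drift by a second or so (12:10:02 followed by 12:20:01 in the first sample), so the fixed 10 minutes here is an approximation.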

The afterthought of this inquiry is that, in spite of the high “intelligence” of the most sophisticated AIs, one thing they cannot do, at least for now, is experiment. Of the four AIs I tested, only Perplexity got the answer right. But it did not perform a test to reach its conclusion, or at least did not tell me that it did. In fact, we can safely assume that none of them actually ran the ‘sar’ command on a Linux (or UNIX) machine before answering my question.

Further, even if an AI did, it would have a hard time interpreting the result, or would simply ignore it. To test this hypothesis, I asked ChatGPT and DeepSeek the question again, this time providing my ‘date’ and ‘sar | tail -3’ commands and their output (see above), without my reasoning. A person of moderate intelligence, not necessarily knowing Linux or UNIX, should be able to infer that the timestamp in the first column of the ‘sar’ output must be the ending time of the interval, and yet both ChatGPT and DeepSeek still insisted it is the starting time.

Finding answers by doing experiments has been at the core of modern science since the 16th and 17th centuries, when the Renaissance scientist Galileo helped lay its foundation. AI obviously won’t be able to do a chemistry, physics, biology, or engineering experiment. But an “experiment” of running a command on a computer? Apparently it doesn’t bother to do that, either. This may change, though, as AI is evolving every day. Suppose intelligence level 1 consists of searching, finding, and summarizing existing information, level 2 of inferring from existing information, and level 3 of doing an experiment itself and making inferences from the result. Current AI may be in the beginning phase of level 2. There must be a long way to go from the current stage to the highest level, level 3.

March 2025
also published on Medium
