Green Endian
Preserving scientific history is a potential use case of C21, our patented codec technology.
Russia released a rare documentary in August 2020, before the war. It showed footage of the test of the biggest bomb to date, the Tsar Bomba, which was dropped over Novaya Zemlya in 1961.
It was a fusion bomb, one that releases energy by fusing hydrogen nuclei into helium. The footage states that fusion bombs are less polluting than fission bombs and shows scientists exploring the land after the explosion. This is not true. Historical background radiation data shows a spike in the year the Tsar Bomba was dropped. Fusion releases so much energy that its radiation triggers nuclear reactions in nearby non-hydrogen elements in the air and on the ground, leaving significant pollution behind.
The footage was kept in surprisingly good condition. Analog film was the common standard at that time. How would you preserve such a valuable scientific and historical record in a digital format?
The standard number representations on Intel, AMD, and ARM processors are little endian and big endian. They record the brightness of each pixel, a point in the image, as a binary-encoded number stored in order of increasing or decreasing significance. This is similar to the decimal Arabic numerals that we use, except that each digit takes only two values, zero or one.
Headers describing endianness may not be easy to interpret in the future. Is there a format that a scientist a thousand years from now could read more easily? There is. It builds on a limitation of standard binary formats: the numbers that a CISC or RISC class microprocessor handles most often are low in value. Many parameters are just a boolean yes or no, represented by many zeros and a single one.
Computer logic has hot spots. Loops usually run for only a few dozen iterations, which makes smaller numbers more prevalent. In statistics, such a concentration is called a hot spot.
Eventually, say in AD 3000, a scientist will try to look at these images. An artificial intelligence model can find patterns across lines, recovering the width of each line and the stride to the beginning of the next one. This is easiest if the density of ones is proportional to the brightness of the pixels: the more ones, the higher the number.
The learning process has three steps. First, it finds the blurry hot spots in the binary stream, the regions with more ones, and identifies the eight-bit representation of the pixels. Then it can easily find the line widths and strides. Finally, slopes can reveal the byte ordering, making the image sharp.
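The second step can be illustrated without any learning at all. The sketch below is a simple statistical stand-in, not the artificial intelligence model described above: it counts the ones in each byte and picks the smallest lag at which those counts repeat. The names `popcount` and `estimate_stride` and the synthetic test image are assumptions made for illustration only.

```python
def popcount(byte: int) -> int:
    """Number of one bits in a byte."""
    return bin(byte).count("1")


def estimate_stride(stream: bytes, max_lag: int) -> int:
    """Return the lag (in bytes) at which the density of ones repeats best.

    For each candidate lag, the mean absolute difference between the
    popcount of byte i and byte i + lag is computed. The smallest lag
    with the lowest mismatch is taken as the stride.
    """
    counts = [popcount(b) for b in stream]
    best_lag, best_score = 1, float("inf")
    for lag in range(1, max_lag + 1):
        pairs = len(counts) - lag
        if pairs <= 0:
            break
        score = sum(abs(counts[i] - counts[i + lag]) for i in range(pairs)) / pairs
        if score < best_score:  # strict comparison: ties keep the smaller lag
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical image: 8 identical rows, each 12 pixels wide and padded with
# 4 zero bytes, so the true stride is 16 bytes. Brighter pixels carry more ones.
row = bytes([0x00, 0x01, 0x01, 0x03, 0x03, 0x07,
             0x07, 0x0F, 0x0F, 0x1F, 0x3F, 0x7F]) + bytes(4)
image = row * 8
stride = estimate_stride(image, max_lag=40)
print(stride)
```

Real footage would not have identical rows, so the mismatch at the true stride would be small rather than zero; the idea of scanning lags for the best match stays the same.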
This kind of behavior suggests a low power endian encoding. We call it low power endian because the frequent low values reserve the codes that cause few bit flips in a serial stream. Zero comes first as 00000000, followed by one encoded as 00000001 and two encoded as 00000010. However, by this rule the next code is 00000100, representing three, since it also has just a single one. 00001000 follows as four.
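The rule can be written down in a few lines. The following is a minimal sketch, assuming that codes are ordered first by their number of ones and then by ascending plain binary value; the tie-break is inferred from the codepage at the end of this article, and the names `green_codepage`, `ENCODE`, and `DECODE` are hypothetical.

```python
def green_codepage(bits: int = 8) -> list:
    """Codes ordered by number of ones, then by plain binary value."""
    return sorted(range(1 << bits), key=lambda code: (bin(code).count("1"), code))


ENCODE = green_codepage()                                     # value -> code
DECODE = {code: value for value, code in enumerate(ENCODE)}   # code -> value

# Print the first rows in the same layout as the codepage.
for value in range(10):
    print(f"{value:03d} {ENCODE[value]:08b}")
```

Because the table is a permutation of the 256 byte values, decoding is a plain reverse lookup and no information is lost.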
Bit flips drive power use in the processors used today. It is not the one itself but the switching from one to zero or vice versa that draws current, generating heat and costing you money.
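To compare encodings on real data, the number of flips in a stream can simply be counted. This is a rough sketch under the assumption that bytes are sent serially with the most significant bit first; the function name `count_bit_flips` is hypothetical, and actual power draw depends on the hardware.

```python
def count_bit_flips(stream: bytes) -> int:
    """Count 0->1 and 1->0 transitions between consecutive bits, MSB first."""
    flips = 0
    previous = None
    for byte in stream:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            if previous is not None and bit != previous:
                flips += 1
            previous = bit
    return flips


# 00000000 00000100 has one rising and one falling edge.
print(count_bit_flips(bytes([0b00000000, 0b00000100])))  # 2
```

Running the same pixel data through both plain binary and a candidate encoding and comparing the two counts gives a first estimate of whether the encoding helps for that data.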
In an image, black encoded as zero may not be the most prevalent pixel value, that is, the mode or median. A green encoding like the one above may therefore use ones' complement, with the average, median, or mode pixel as the base for zero. Codes with few bit flips, those with many zeros or many ones, will then surround this base value, reducing the bit flips needed to process the entire stream.
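One possible reading of this idea is sketched below; it is an assumption, not a specification. Pixels at or above the base get the codes with the fewest ones, and pixels below the base get the bit-inverted codes, in the spirit of ones' complement negation. The names `encode_pixel` and `decode_pixel`, the wrap-around of distances at 128, and the sample pixel values are all hypothetical.

```python
from collections import Counter

# Codes ordered by number of ones, then by plain binary value.
CODEPAGE = sorted(range(256), key=lambda code: (bin(code).count("1"), code))
INDEX = {code: value for value, code in enumerate(CODEPAGE)}


def encode_pixel(pixel: int, base: int) -> int:
    """Encode a pixel relative to the base value."""
    diff = (pixel - base + 128) % 256 - 128   # signed distance, -128..127
    if diff >= 0:
        return CODEPAGE[diff]                 # few ones at and just above the base
    return CODEPAGE[-diff - 1] ^ 0xFF         # many ones just below the base


def decode_pixel(code: int, base: int) -> int:
    """Invert encode_pixel."""
    value = INDEX[code]
    if value < 128:
        diff = value
    else:
        diff = -(INDEX[code ^ 0xFF] + 1)
    return (base + diff) % 256


pixels = [120, 121, 119, 120, 122, 120]
base = Counter(pixels).most_common(1)[0][0]   # mode of the sample, here 120
codes = [encode_pixel(p, base) for p in pixels]
print([f"{c:08b}" for c in codes])
```

The base has to be stored alongside the stream, since the decoder needs it to recover the original pixel values.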
Obviously there are many ways to leverage hot spots with shuffled endianness.
Artificial intelligence chips may also use low power, or green, endian to represent neuron weights centered on zero. This may help reduce data center power use if very low and very high numbers are the most prevalent weights or values in neuron models. Green endian may also be useful for encoding LLM tokens, giving the most prevalent ones codes with few bit flips.
You may say that such an encoding does little to reduce your home electricity bill. Remember that images are processed in bulk in huge data centers that analyze data and generate artificial intelligence models. Training an LLM can cost tens of thousands of dollars or more. Changing the endianness of such heavy workloads may have an impact at scale. The saving is not as much energy as a fusion reactor would yield, but it may be worthwhile to try.
The green endian codepage follows. Each row gives the decimal value followed by its eight-bit code.
000 00000000
001 00000001
002 00000010
003 00000100
004 00001000
005 00010000
006 00100000
007 01000000
008 10000000
009 00000011
010 00000101
011 00000110
012 00001001
013 00001010
014 00001100
015 00010001
016 00010010
017 00010100
018 00011000
019 00100001
020 00100010
021 00100100
022 00101000
023 00110000
024 01000001
025 01000010
026 01000100
027 01001000
028 01010000
029 01100000
030 10000001
031 10000010
032 10000100
033 10001000
034 10010000
035 10100000
036 11000000
037 00000111
038 00001011
039 00001101
040 00001110
041 00010011
042 00010101
043 00010110
044 00011001
045 00011010
046 00011100
047 00100011
048 00100101
049 00100110
050 00101001
051 00101010
052 00101100
053 00110001
054 00110010
055 00110100
056 00111000
057 01000011
058 01000101
059 01000110
060 01001001
061 01001010
062 01001100
063 01010001
064 01010010
065 01010100
066 01011000
067 01100001
068 01100010
069 01100100
070 01101000
071 01110000
072 10000011
073 10000101
074 10000110
075 10001001
076 10001010
077 10001100
078 10010001
079 10010010
080 10010100
081 10011000
082 10100001
083 10100010
084 10100100
085 10101000
086 10110000
087 11000001
088 11000010
089 11000100
090 11001000
091 11010000
092 11100000
093 00001111
094 00010111
095 00011011
096 00011101
097 00011110
098 00100111
099 00101011
100 00101101
101 00101110
102 00110011
103 00110101
104 00110110
105 00111001
106 00111010
107 00111100
108 01000111
109 01001011
110 01001101
111 01001110
112 01010011
113 01010101
114 01010110
115 01011001
116 01011010
117 01011100
118 01100011
119 01100101
120 01100110
121 01101001
122 01101010
123 01101100
124 01110001
125 01110010
126 01110100
127 01111000
128 10000111
129 10001011
130 10001101
131 10001110
132 10010011
133 10010101
134 10010110
135 10011001
136 10011010
137 10011100
138 10100011
139 10100101
140 10100110
141 10101001
142 10101010
143 10101100
144 10110001
145 10110010
146 10110100
147 10111000
148 11000011
149 11000101
150 11000110
151 11001001
152 11001010
153 11001100
154 11010001
155 11010010
156 11010100
157 11011000
158 11100001
159 11100010
160 11100100
161 11101000
162 11110000
163 00011111
164 00101111
165 00110111
166 00111011
167 00111101
168 00111110
169 01001111
170 01010111
171 01011011
172 01011101
173 01011110
174 01100111
175 01101011
176 01101101
177 01101110
178 01110011
179 01110101
180 01110110
181 01111001
182 01111010
183 01111100
184 10001111
185 10010111
186 10011011
187 10011101
188 10011110
189 10100111
190 10101011
191 10101101
192 10101110
193 10110011
194 10110101
195 10110110
196 10111001
197 10111010
198 10111100
199 11000111
200 11001011
201 11001101
202 11001110
203 11010011
204 11010101
205 11010110
206 11011001
207 11011010
208 11011100
209 11100011
210 11100101
211 11100110
212 11101001
213 11101010
214 11101100
215 11110001
216 11110010
217 11110100
218 11111000
219 00111111
220 01011111
221 01101111
222 01110111
223 01111011
224 01111101
225 01111110
226 10011111
227 10101111
228 10110111
229 10111011
230 10111101
231 10111110
232 11001111
233 11010111
234 11011011
235 11011101
236 11011110
237 11100111
238 11101011
239 11101101
240 11101110
241 11110011
242 11110101
243 11110110
244 11111001
245 11111010
246 11111100
247 01111111
248 10111111
249 11011111
250 11101111
251 11110111
252 11111011
253 11111101
254 11111110
255 11111111