Synthesis and comments on Cybernetics by N. Wiener

26 Feb 2021

Chapter-by-chapter summary

I. Newtonian and Bergsonian Time

p32. Newtonian time is defined as the time variable of the physical laws, defined as a “formal set of postulates and a closed mechanics”.

💡 The transformation of the variable into its negative has no effect on the mechanics. I believe it’s important to notice that this holds formally, pretty obviously (after all, any purely formal theory will have the characteristics decided by its “designer”). It also still holds in the real world (when applying the theory) because we only consider phenomena from the mechanics of closed systems. But, as we know, the 2nd law of thermodynamics tells us that the entropy of an isolated system does not decrease, and that it increases as soon as dissipation (friction, heat exchange) comes into play, which wouldn’t allow a Newtonian time to be used by the scientist studying such a system.

%causality

p33. He takes a step back:

Using the Newtonian laws, or any other system of causal laws whatever, all that we can predict at any future time is a probability distribution of the constants of the system…

p34. He asks the question: “Why does the unidirectional thermodynamics which is based on experimental terrestrial observations stand us in such good stead in astrophysics?”. The explanation is said to be “not too obvious”, and I haven’t found it clear enough to be convinced. It goes along these lines: if we imagine a star that is doing the inverse of radiating light to us, either we couldn’t see it, or we would need to carry out our measurement process in reverse time, which is impossible. Concluding: “Thus the part of the universe which we see must have its past-future relations, as far as the emission of radiation is concerned, concordant with our own”.

p34. Further, he raises the question of an “intelligent being whose time should run the other way to our own”. This could happen, and we would see phenomena which would be strange (fortuitous, catastrophic) but still perfectly explainable. This being would think the same of us, and as a conclusion, we would be incapable of communicating with each other:

Within any world with which we can communicate, the direction of time is uniform.

This sounds plausible, but I am not very convinced by his demonstration either. Let’s note that this precise conclusion is an important cornerstone of any further development in his book.

p35. He makes the link between the theory of evolution (Darwin) and theories based on statistics (like tidal waves). The overall idea is that real-world dynamical processes (let’s say: non-closed?) see “fortuitous variability” (similar to Darwin’s mutations) converted into “patterns of development which [read] in one direction”.

p37. He further mentions that Planck and Bohr, with their quantum theory, have shown that statistical averaging happens even in the present (not only averaging through time, but through quantum probabilities). And thus, even Newtonian physics is “the average results of a statistical situation, and hence an account of an evolutionary process”.

💡❓ On the “evolutionary process” part, it seems he means something close to: time-bound evolution selects the best possible future (for a species to survive or for a wave to dissipate as much energy as possible?), but actually quantum-bound evolution selects the best possible present: see quote below. This sense is very appealing but would definitely need some further explanation and demonstration.

the complete collection of data for the present and the past is not sufficient to predict the future more than statistically

p38. He absorbs (and thereby dissolves) the Bergsonian dichotomy between the reversible time of physics and the irreversible time of evolution, mentioning notably that the chance of the quantum theorist is not the ethical freedom of the Augustinian.

💡🔴 This sounds very much like a determinist or mechanist argument, going in the metaphysical direction that he seems to follow throughout the book. He allows himself to say this following the assertion that he made earlier that the 2nd law of thermodynamics is a direct consequence of the statistical nature of the underlying processes (notably friction forces, etc.). This would definitely need to be further validated, notably in view of how scientists have until now managed to validate the 2nd law of thermodynamics.

p38. He then mentions that the real first industrial revolution was actually that of Huygens and Newton: the clockwork and the engineering of navigation. The new revolution is the “age of communication and control”. Its main characteristic (which sets it apart from previous industrial revolutions) is the accurate reproduction of signals.

p40. He mentions the Occasionalists, Spinoza and then Leibniz, who really found a way to describe the relation of mind and matter in a dynamical way. The monad is the intellectual device that allows one to think of the world as a continuum of “Newtonian solar system[s] writ small”, from matter to mind.

p42. Mentions that wasting energy is a parallel of sending a signal (electronic tube). Starts to make the link with the human body, mentioning automata that need to be coupled with the outside world (through messages, information processing). What matters here, according to him, is the communication engineering: message, noise, information, coding… He thus sees the 20th century as “the age of servomechanisms as the nineteenth century was the age of the steam engine or the eighteenth century the age of the clock” (hasn’t he forgotten electricity?).

💡 Here we see an instance of an intellectual and scientific framework that he puts forward at other points in the book:

Make use of analogies, and especially analogies between (a) the natural phenomena that come into play in the construction of a scientific theory, (b) the devices built by man to test and apply these theories, (c) the human body and mind, about whose functioning we would like to be able to reach conclusions.
Building synthetic systems that look like versions of the living enables us to think “back” about the living and draw scientific conclusions about how actual living organisms work. Notably: the human mind.

💡🤔 We could say this is more of a framework than a tool, and maybe even a political stance, a program for mankind. This definitely looks like a progress-oriented stance that builds on the Industrial Revolution mindset to give mankind more and more control over nature (itself building upon the oldest ideologies about technique itself). The real philosophical change in this program is a new, important addition: understanding ourselves through science and technique.

%arrow-of-time %explainability %scientific-method

II. Groups and Statistical Mechanics

p45. Key idea of Gibbs: practice ≠ Newtonian dynamics, in that we can’t know all the initial positions and momenta.

p46. To reach the right mathematical conclusions, Gibbs needed to work with infinite sums of probabilities 0 that build probabilities > 0, and with distributions adding up to 1. This required Lebesgue’s work on functions as sums of series.

p47. Phase space: N degrees of freedom => N (position coordinates) + N (momenta) dimensions. Then, mentions invariants which allow reducing dimensions.

p49. Gibbs had the idea that time averages and space averages are the same, this is the ergodic hypothesis.

Gibbs himself thought that in a system from which all the invariants had been removed as extra coordinates almost all paths of points in phase space passed through all coordinates in such a space. This hypothesis he called the ergodic hypothesis.

Actually quasi-ergodic hypothesis: system passes indefinitely near any point in the phase space (limited by invariants).

p50. General considerations about science:

“For the existence of any science, it is necessary that there exist phenomena which do not stand isolated.”
“The essence of an effective rule for a game or a useful law of physics is that it be statable in advance, and that it apply to more than one case.”

Then builds up by talking about transformation groups, group invariants, group character functions (stable by inversion and multiplication, character functions verifying f(Tx) = a(T)f(x) with |a| = 1), characterizing the character group structure based on the original group structure.

p54. Ergodic theory: the theory of metrical invariants of a group of transformations. He shows that it allows one to construct a justification of Gibbs’ interchange of phase averages and time averages. His construction is based on a transformation T that leaves invariant no set of points of measure ≠ 0 or 1, and on the limits of the functions which average all ƒ(T^n(x)), with T^n moving in phase or time. These limit functions are shown to be constant and to take the same value, which is the integral of ƒ (from 0 to 1).
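
The interchange of time and phase averages can be illustrated numerically. The sketch below is not Wiener’s construction: it assumes, as a stand-in example, the rotation T(x) = (x + α) mod 1 with an irrational α (a standard ergodic transformation of [0, 1]) and the test function ƒ(x) = x², whose integral over [0, 1] is 1/3. The names `time_average`, `f` and `alpha` are illustrative choices.

```python
import math

def time_average(f, x0, alpha, n_steps):
    """Average f along the orbit x0, T(x0), T^2(x0), ... of T(x) = (x + alpha) mod 1."""
    total = 0.0
    x = x0
    for _ in range(n_steps):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / n_steps

def f(x):
    return x * x  # its phase average over [0, 1] is exactly 1/3

alpha = math.sqrt(2) - 1  # irrational step (assumed example), so the rotation is ergodic
avg_a = time_average(f, 0.1, alpha, 200_000)
avg_b = time_average(f, 0.7, alpha, 200_000)
print(avg_a, avg_b)  # both should be close to 1/3, whatever the starting point
```

Both starting points give time averages close to the same constant, the integral of ƒ, which is the behaviour the limit functions described above are shown to have.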

p55. When transformations are not ergodic, they can still be reduced to ergodic components.

p55. Entropy, taken from Thermodynamics = ln(probability(region in phase space)).
Thermal equilibrium (max entropy for a given temperature and volume) => we can talk about local temperatures. This works for thermal engines but not for living matter, where temperature is statistical.

p56. Maxwell demon: the demon is coupled to the box system, it needs information, which is negative entropy. Coupling is a coupling on energy, because of quantum mechanics. After some time, the demon has random motion (because of temperature) and loses correct perception.

💡 Concise and strong explanation about the Maxwell demon. Interesting to note the link between energy and information through the fact that correct perception is reduced by temperature.

In the same way enzymes & man are metastable. Real equilibrium is death.

The stable state of a living organism is to be dead.

III. Time Series, Information, and Communication

p60. Contextualization: time series and the processing they need in some systems (telephone devices, gun pointers…) require an automatic chain of operations, able to work on information itself.

p61. What is information? It is a decision, like the one of choosing heads or tails.
In context of a quantity known to lie between 0 and 1, the perfectly precise measurement of it would be represented by an infinite binary number a = 0.a1 a2 a3 a4…. This makes one decision per binary digit, thus an infinite amount of decisions. Perfect precision corresponds to an infinite amount of information.

In real life, we have a measurement error ∆ = 0. b1 b2 b3…, where bk is the first non-0 digit. With such an error, the first digits of a up to a[k-1], and maybe ak, are significant, while the others a[n>k] are not. In this context, the number of decisions that are really taken in the representation of a amounts to -log2(∆), or ≈ k (depending on the significance of ak). The amount of information is defined as -log2(∆).

Let’s consider that we know a priori that a lies in [0, 1] and that a posteriori (depending on ∆) it lies in [x, y]. It is also useful to consider the amount of information as how much smaller the measure of the interval [x, y] is compared to the measure of [0, 1]. The amount of information we have from our a posteriori knowledge is -log2(|[x, y]|/|[0, 1]|) (|| being a measure).

💡 We see here that information ↗ with precision ↗ and with |[x, y]| ↘ (here we talk of precision of measure).
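
A small numerical sketch of these two readings of the amount of information. It assumes the sign convention -log2, so that a smaller error or a narrower interval gives a larger positive number of bits; the function names are illustrative.

```python
import math

def information_bits(error):
    """Amount of information -log2(error) for a quantity known to lie in [0, 1]."""
    return -math.log2(error)

def interval_information_bits(low, high):
    """Same quantity, read as the shrinkage from [0, 1] (measure 1) to [low, high]."""
    return -math.log2((high - low) / 1.0)

bits_coarse = information_bits(1 / 8)        # error 0.001 in binary: 3 significant digits
bits_fine = information_bits(1 / 1024)       # ten binary digits decided
bits_interval = interval_information_bits(0.25, 0.5)
print(bits_coarse, bits_fine, bits_interval)  # 3.0 10.0 2.0
```

Letting the error go to 0 makes the result grow without bound, which is the sense in which perfect precision corresponds to an infinite amount of information.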

p62. A more generic version of the above: with the probability that a variable is in [x, x + dx] defined by f1(x)dx.

💡 We see that f1 acts as a precision: where f1 is high, there is less room for the variable to lie elsewhere on the ℝ line.
Thus, log2(f1) is a good base for a measure of information. With log2(f1)f1dx we have normalized this using probability.

Thus ∫log2(f1)f1dx is total information (note: ∫f1 = 1)

p63. Trying to reach conclusions based on these definitions, through calculus.

The amount of information from independent sources is additive.

Determining the information gained by fixing one or more variables in a problem.
A priori distrib: u(x)dx = 1/√(2πa) exp(-x^2/(2a)) dx.
A posteriori, with u + v = w, the distribution of u is proportional to:

u(x)v(w-x)dx = 1/√(2πa) exp(-x^2/(2a)) 1/√(2πb) exp(-(w-x)^2/(2b)) dx

Excess information is - old a priori + new a posteriori (the a posteriori density being normalized to integrate to 1):

excess information = - ∫u(x)log2(u(x))dx + ∫u(x)v(w-x)log2(u(x)v(w-x))dx

We reach conclusions on the message versus noise question:

the information carried by a precise message in the absence of a noise is infinite. In the presence of a noise, however, this amount of information is finite, and it approaches 0 very rapidly as the noise increases in quantity.
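
The Gaussian message-plus-noise case can be checked with a short computation. This is a sketch under stated assumptions, not a transcription of the book’s derivation: it uses the standard facts that ∫f log2(f) dx = -½ log2(2πe s) for a Gaussian density of variance s, and that the distribution of u once w = u + v is known is Gaussian with variance ab/(a + b). The function names are illustrative.

```python
import math

def gaussian_information_bits(variance):
    """Integral of f * log2(f) for a Gaussian density f with the given variance."""
    return -0.5 * math.log2(2 * math.pi * math.e * variance)

def excess_information_bits(a, b):
    """Information gained about u ~ N(0, a) by observing w = u + v, with v ~ N(0, b)."""
    posterior_variance = a * b / (a + b)  # variance of u once w is known
    return gaussian_information_bits(posterior_variance) - gaussian_information_bits(a)

def closed_form_bits(a, b):
    """The same gain written directly as 1/2 * log2((a + b) / b)."""
    return 0.5 * math.log2((a + b) / b)

for noise in (0.01, 1.0, 100.0):
    print(noise, excess_information_bits(1.0, noise), closed_form_bits(1.0, noise))
```

With message variance a = 1, the gain is exactly half a bit when the noise variance is also 1; it grows without bound as the noise variance goes to 0 and shrinks toward 0 as the noise grows, in line with the conclusion quoted above.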

p64. Relationship with thermodynamics, and specifically the 2nd law:

processes which lose information are, as we should expect, closely analogous to the processes which gain entropy.

No operation on a message can gain information on the average.

p66. Observation and messages

A set of observations depends in an arbitrary way on a set of messages and noises with a known combined distribution. We wish to ascertain how much information these observations give concerning the messages alone.

p67. Birkhoff ergodic theorem applied to time series in statistical equilibrium & information of statistical parameters:

Time recording of time series <=> Phase space compute set of stat params of ensemble in stat equilibrium to which time series belong

given the entire history up to the present of a time series known to belong to an ensemble in statistical equilibrium, we can compute with probable error zero the entire set of statistical parameters of an ensemble in statistical equilibrium to which that time series belongs.

(an) Fourier params calculated from the past, an = ∫[e^t t^n A(t)dt]t<0 => Knowledge (distribution) of A(t) defined for t>0 (in the future).

We can obtain the prediction to meet any desired criterion of goodness.

Knowledge of the past => Compute amount of information we have of the future beyond a certain point t1 > 0.

p70. Study of Brownian motion based systems and their prediction.

Brownian motion: mean square motion in direction proportional to length of time, motions time-uncorrelated.

0    -> t1: ∆x1
tn-1 -> tn: ∆xn

Probability that particles lie in [x1, x1+dx] at t1 & … & [xn, xn + dx] at tn

= exp[-x1^2/2t1 - (x2 - x1)^2/2(t2 - t1)…]/√[(2π)^n t1(t2 - t1)…] dx1dx2…
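
The density above corresponds to independent Gaussian increments whose variance equals the elapsed time. A minimal simulation sketch of that property follows; the seed, the sample size and the chosen times are arbitrary assumptions made for illustration.

```python
import random

def brownian_positions(times, rng):
    """Sample x(t1), ..., x(tn) from independent Gaussian increments of variance t_k - t_(k-1)."""
    x, previous_t, path = 0.0, 0.0, []
    for t in times:
        x += rng.gauss(0.0, (t - previous_t) ** 0.5)
        previous_t = t
        path.append(x)
    return path

rng = random.Random(0)  # fixed seed, only for reproducibility
times = [1.0, 2.0, 4.0]
n_paths = 20_000
sums = [0.0] * len(times)
for _ in range(n_paths):
    for i, x in enumerate(brownian_positions(times, rng)):
        sums[i] += x * x
mean_squares = [s / n_paths for s in sums]
print(mean_squares)  # each entry should be close to the corresponding time
```

The estimated mean square displacement grows in proportion to the length of time, as stated in the definition of the Brownian motion above.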

The set of paths corresponding to the different possible Brownian motions can depend on a param α ∈ [0, 1], so that each path is a function x(t, α), and where the “probability that a path lies in a certain set S is the same as the measure of the set of values of α corresponding to paths in S”.

He proves:

∫0->1[x(t1, α)x(t2, α)…x(tn, α)]dα = ΣΠ∫0->1[x(tj, α)x(tk, α)]dα

when we know the averages of the products of x(tj, α) by pairs, we know the averages of all polynomials in these quantities, and thus their entire statistical distribution.

p72. We look at ∫K(t)dξ(t, γ), with K any function and ξ defined for t running over ℝ by

ξ(t, γ) = ξ(t, α, β)
ξ(t, α, β) = x(t, α) if t >= 0
ξ(t, α, β) = x(-t, β) if t < 0

γ being a single parameter onto which the pair (α, β) is mapped (via diagonalisation).

He proves that, for the time series f(t, γ) = ∫K(t + τ)dξ(τ, γ), all statistical parameters depend on the autocorrelation function with lag τ of K, and thus it is in statistical equilibrium.

Autocorrelation function: Φ(τ) = ∫K(s)K(s + τ)ds
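
As an illustration of this definition, here is a numerical sketch for a hypothetical kernel, K(s) = exp(-s) for s ≥ 0 and 0 otherwise (not a kernel taken from the book). For that choice the integral can be done by hand and gives Φ(τ) = exp(-|τ|)/2, which the code compares against. Integration bounds and step count are arbitrary assumptions.

```python
import math

def kernel(s):
    """Hypothetical example kernel: K(s) = exp(-s) for s >= 0, and 0 otherwise."""
    return math.exp(-s) if s >= 0 else 0.0

def autocorrelation(K, tau, s_min=-40.0, s_max=40.0, n=80_000):
    """Midpoint-rule approximation of the integral of K(s) * K(s + tau) ds."""
    ds = (s_max - s_min) / n
    total = 0.0
    for i in range(n):
        s = s_min + (i + 0.5) * ds
        total += K(s) * K(s + tau)
    return total * ds

for tau in (0.0, 1.5, -1.5):
    print(tau, autocorrelation(kernel, tau), math.exp(-abs(tau)) / 2)
```

The result depends only on the lag τ and is symmetric in it, which fits the idea that the statistics of the series are unchanged by a shift of the time origin.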

We can thus show that for any bounded measurable functional (quantity depending on the entire distribution of the values of the function of t), we have

lim σ->∞ ∫0->1[ℱ(f(t, γ))ℱ(f(t + σ, γ))]dγ = {∫0->1[ℱ(f(t, γ))]dγ}²

and by ergodic theorem:

∫0->1[ℱ(f(t, γ))]dγ = lim T->∞ 1/T ∫-T->0[ℱ(f(t, γ))]dt

Thus

we can almost always read off any statistical parameter of such a time series, and indeed any denumerable set of statistical parameters, from the past history of a single example.

p76. “Try to build up a time series as general as possible from the simple Brownian motion series.”

Time series in the form:

∫a->b [exp(i ∫K(t + τ, λ)dξ(τ, γ))] dλ

This is

a first step toward the solution of the problem of reducing a large class of time series to a canonical form, and this is most important for the concrete formal application of the theories of prediction and of the measurement of information

p78. To further generalize

the question is: under what circumstances can we represent a time series of known statistical parameters as determined by a Brownian motion(…)?

He concludes that this is a research program which “offers the best hope for a rational, consistent treatment of many problems associated with non-linear prediction…”.

p80. Going back to the prediction problem for time series ∫K(t)dξ(t, γ).

He shows that the past and present of the differential dξ(t, γ) determines the past and present of the time series, and conversely.

He then builds best prediction operators in context of message + noise.

Wave filters with lag a are used to build the “best representation” of m(t + a) (written on the frequency scale) (m is the message).

p86. Rate of transmission of information in case of messages and noises derived from the Brownian motion.

He calculates the total amount of information available concerning the distribution of the message, then the rate of transmission of information. It depends not only on the width of the frequency band available for transmitting the message but also on the noise level.

p88. Concludes on the “theory of messages depending linearly on the Brownian motion”:

it gives the best possible design of predictors and wave filters in case message and noise represent the response of linear resonators to Brownian motions
in much more general case, they represent a possible design for predictors and filters.

p88. Mentions multiple time series: more complex.

p89. Mentions discrete time-series:

by a process of step-by-step prediction, we can solve the entire problem of linear prediction for discrete time series.

The filters for discrete time series are usually not so much physically constructible devices to be used with an electric circuit as mathematical procedures to enable statisticians to obtain the best results with statistically impure data.

p92. Caveat: the statistical theories here involve a full knowledge of the past. To go further, we have to extend existing methods of sampling, which most probably requires Bayes’ law.

p92. Discussion on quantum mechanics. Progression:

Newtonian physics: the sequence of physical phenomena is completely determined by all positions and momenta at any one moment.
Gibbsian theory: with a perfect determination of the multiple time series of the whole universe the knowledge of all positions and momenta at any one moment would determine the entire future.
Heisenberg: “time series can in no way be reduced to an assembly of determinate threads of development in time”.

In quantum mechanics, the whole past of an individual system does not determine the future of that system in any absolute way but merely the distribution of possible futures of the system.

Classical physics worked “over the range of precision where it has been shown experimentally to be applicable”.

Mentions that high-frequency (short-wavelength) light allows a high-precision measurement of the position of a system but subjects it to a change in momentum.

p93. Asserts that this theory is still applicable to the theory of entropy.

A system will transform itself in the course of time into any other state, but the probability of this depends on the relative probability or measure of the two states. The probability is high for states which can be transformed into themselves by a large number of transformations, or which have high internal resonance.

High internal resonance allows for more stability.

Haldane: this is maybe how genes and viruses reproduce themselves. He notes that there is no sense in mentioning which gene is a copy or the master version, as there is no such thing as perfectly sharp individuality on a quantum level.

Szent-Györgyi: substances with high resonance have an “abnormal capacity” for storing both energy and information.

IV. Feedback and Oscillation

p96. Feedback, chain of feedback, negative feedback

Our motion is regulated by some measure of the amount by which it has not yet been accomplished.
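
A minimal sketch of this idea, assuming a simple discrete-time proportional correction (the names `move_toward` and `gain` are illustrative and do not come from the book): at each step the position is corrected by a fraction of the amount not yet accomplished.

```python
def move_toward(target, start=0.0, gain=0.5, steps=20):
    """Each step corrects the position by a fraction of the remaining error."""
    position = start
    history = [position]
    for _ in range(steps):
        error = target - position   # the amount by which the motion is not yet accomplished
        position += gain * error    # negative feedback: the correction reduces the error
        history.append(position)
    return history

trajectory = move_toward(target=1.0)
overdriven = move_toward(target=1.0, gain=2.5)
print(trajectory[:4], trajectory[-1])
print(overdriven[:4])
```

With gain 0.5 the remaining error halves at every step. With gain 2.5 the error is multiplied by -1.5 at every step, so the position overshoots with growing amplitude: a small picture of a feedback that is too heavy and turns into an increasing oscillation.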

p97. Output of the effector. In this section, he talks about linear effectors, and their problems.

p98. Pieces of apparatus which delay inputs => ƒ(t - τ)

We can approximate ∫0->∞ a(τ) ƒ(t - τ) dτ by sums Σ ak ƒ(t - τk) (“Operator”)

💡 A small note on non-determinism in text:

… a streetcar which may turn off one way or the other at a switch, which is not determined by its past.

The expression above is linear and independent of a shift of the origin of time. All such operators on the past have this form, or are limits of sequences of operators of this form.

p100. Look at ƒ(t) = exp(zt)

=> Operator is exp(zt) ∫0->∞ a(τ)exp(-zτ) dτ = exp(zt) A(z)

A(z) is the representation of the Operator as a function of frequency.

Remarks on the mathematical content:

|A(x + iy)| <= √[1/(2x) ∫0->∞ |a(τ)|² dτ ]

=> A is bounded in every half-plane x >= ε > 0 with A(iy) the boundary values.
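
Both the definition of A(z) and this bound can be checked numerically on one example. The sketch assumes the kernel a(τ) = exp(-τ/k)/k, for which the integral gives A(z) = 1/(1 + kz) and ∫|a(τ)|² dτ = 1/(2k); the value of k, the test point z and the integration settings are arbitrary illustrative choices.

```python
import cmath
import math

def frequency_response(a, z, tau_max=60.0, n=120_000):
    """Midpoint-rule approximation of A(z) = integral from 0 to infinity of a(tau) exp(-z tau) dtau."""
    d_tau = tau_max / n
    total = 0j
    for i in range(n):
        tau = (i + 0.5) * d_tau
        total += a(tau) * cmath.exp(-z * tau)
    return total * d_tau

k = 2.0  # assumed time constant of the example kernel

def a(tau):
    return math.exp(-tau / k) / k

z = 0.3 + 1.0j                      # a point in the right half-plane x > 0
numeric = frequency_response(a, z)
exact = 1 / (1 + k * z)
bound = math.sqrt(1 / (2 * z.real) * (1 / (2 * k)))  # sqrt(1/(2x) * integral of |a|^2)
print(numeric, exact, abs(exact), bound)
```

The numerical integral agrees with 1/(1 + kz), and its modulus stays below the bound, as expected for any x > 0.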

💡 Indeed, x ->+ 0 defines the “max boundary” ∀ x>0 as per the expression above, and this is A(x + iy) ->+ A(iy).

p101. With A(x + iy) = u + iv and notably x=0, study the looks of the boundary.

🌅 See figure page 101. The idea here is that interior points are reached taking the normal on the right of the line drawn by A(iy), without crossing the line again.

💡 The author notes that interior points correspond to possible values of A(x + iy), x>0. Not sure what is the link between right normals to A(iy) and this.

p102. Control flow chart of such a system.

🌅 See figure page 102.

Y = X - λAY with X the input, Y the input to the motor, the motor operator is A, and multiplier operator is λ.

The motor output is A Y = A/(1 + λA) X, A/(1 + λA) being the operator with a diagram of

u + iv = A(iy)/(1 + λA(iy))

∞ is an interior point iff -1/λ is an interior point of A(iy), which corresponds to unrestrained and increasing oscillation.
If -1/λ is an exterior point, the feedback is stable.
If -1/λ is on the boundary, it’s more complicated, but overall there will be an oscillation with an amplitude which does not increase.

p103. The author examines a series of different operators to study their feedback range.

If A(z) = z, A(iy) is the bottom-up directed ordinate line, thus all the right ℂ plane is interior. -1/λ is always exterior, thus any feedback is ok.

If A(z) = u + iv = 1/(1 + kz), it’s equivalent to u² + v² = u: the circle with radius ½ and center at (½, 0) described clockwise, thus with interior points interior to the circle. -1/λ is always exterior, so any feedback is ok.
Note: corresponding a(t) is a(t) = exp(-t/k)/k.

If A(z) = (1/(1 + kz))², it’s (in polar) √ρ = - sin ϕ/2 or √ρ = cos ϕ/2, which is a cardioid described clockwise, thus with interior points interior to it. All feedback is ok.

If A(z) = (1/(1 + kz))³, it’s (in polar) ∛ρ = cos ϕ/3: 🌅 see figure page 105. -1/λ is interior iff λ > 8.

If A(z) = exp -Tz, a delay in time, then u + iv = exp -Tiy = cos Ty - isin Ty, a clockwise unit circle centered on the origin. -1/λ is interior iff -1/λ > -1 or λ > 1. The limit of feedback intensity is thus 1.

As a conclusion, we see that we can compensate for an operator 1/(1 + kz) by an arbitrarily heavy feedback, so as to get A/(1 + λA) as near to 1 (not 1/λ?) as we wish for all frequencies.
Considering 1/(1 + kz) as an operator and (1/(1 + kz))ˣ as a composition of x of them, we see that we can compensate one operator with one feedback, even two operators with one feedback, but 3 operators will need at least 2 feedbacks.
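
These feedback limits can be cross-checked with a different, equivalent test that is not the diagram method used in the book: for A(z) = (1/(1 + kz))ⁿ the feedback operator A/(1 + λA) equals 1/((1 + kz)ⁿ + λ), so its poles are the roots of (1 + kz)ⁿ = -λ, and the feedback is stable when all of them have a negative real part. The sketch assumes λ > 0; the function names are illustrative.

```python
import cmath
import math

def closed_loop_poles(n, lam, k=1.0):
    """Roots of (1 + k z)**n = -lam, i.e. the poles of A/(1 + lam*A) for A = (1/(1 + k z))**n."""
    radius = lam ** (1.0 / n)
    return [(radius * cmath.exp(1j * math.pi * (2 * m + 1) / n) - 1) / k for m in range(n)]

def is_stable(n, lam, k=1.0):
    """Stable when every pole has a negative real part (no growing oscillation)."""
    return all(p.real < 0 for p in closed_loop_poles(n, lam, k))

print(is_stable(1, 1e6), is_stable(2, 1e6))   # one or two operators: heavy feedback is fine
print(is_stable(3, 7.9), is_stable(3, 8.1))   # three operators: the limit sits at lam = 8
```

For n = 3 the least damped poles have real part (λ^(1/3)/2 - 1)/k, which changes sign exactly at λ = 8, the same limit as the one read off the diagram.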

p106. Ship steering stabilizing system using gyrocompass: this yields a (1/(1 + kz))³ kind of operator, thus “no servomechanism whatever will stabilize the system”.

We can achieve stabilization by using another feedback which would be, for example, the difference between the actual course and the angular position of the rudder.

p108. Link with physiological cybernetics:

One of the great tasks of physiological cybernetics is to disentangle and isolate loci of the different parts of this complex of voluntary and postural feedbacks.

💡 We might appreciate the “disentangle and isolate” approach, which is a direct application of rationality by practicing analysis. And this is precisely what is also causing moral problems when this starts to be not only analysed (disentangled…) but also acted upon.

p108. Effects of heavy feedback

When feedback is possible and stable, its advantage, as we have already said, is to make performance less dependent on the load.

↗️ negative feedback (if stable) => ↗️ stability of the system for low frequencies but ↘️ stability for some high frequencies.

p108. Incipient oscillation: corresponds to the y for which A(iy) is the point of the boundary with u furthest to the left (most negative).

p109. Conclusion on the previous analysis of linear oscillating systems.

These linear oscillating systems nearly always oscillate in the form A sin (Bt + C) exp Dt.
If there is a periodic non-sinusoidal oscillation, “it is always a suggestion at least that the variable observed is one in which the system is not linear”.

Also, for linear oscillations, the amplitude of oscillation is independent of frequency.
Whereas for non-linear oscillations, there is a discrete set of amplitudes for which the system will oscillate at a given frequency, and only a discrete set of such frequencies.

Example: organ pipe. This can be described by a relaxation oscillation: solution periodic in time and determinate in amplitude and frequency but not in phase.

Some systems are non-linear but can be studied as linear when non-linear terms are mostly constant over a period: theory of secularly perturbed systems.

p110. Non-linear systems of relaxation oscillation: well studied when differential equations of low order. But not well studied at the time: integral equations when system depends for its future behavior on its entire past behavior. The author sketches a solution with an infinite system of linear non-homogeneous differential equations.

p111. Competition between feedback systems of control (this chapter) and compensation systems (previous chapter). “Both serve to bring the complicated input-output relations of an effector into a form approaching a simple proportionality”.

The feedback system has a performance relatively independent of the characteristic (and of changes in the characteristic) of the effector used. One should select the method based on the constancy of the characteristic of the effector.

Study of the cases where it’s advantageous to combine both methods.

🌅 See fig 4 page 112: there is a Compensator before the Subtractor, which can compensate the average characteristic of the feedback system.

🌅 See fig 5 page 112: the Compensator is put right after the Subtractor (forming one larger effector with the Effector). In general, this affects the max feedback, but for the same feedback level it will improve the performance of the system. For example: if the Effector has a lagging characteristic, the Compensator will be a predictor.

p113. Makes the link with human/animal reflexes.

Duck shooting error minimization is anticipatory feedback.

Steering of a car on an icy road: control by informative feedback.
🌅 See fig 6 page 114: this one can be modeled with a high-frequency oscillator (bringing information), and a Compensator that explores amplitude-phase relations of the high-frequency output to the input.
Advantages of this type of feedback: “the Compensator may be adjusted to give stability for every type of constant load”. Secular load change: OK (like gun turret friction that goes up very slowly).
“This informative feedback will work well only if the characteristics of the load at high frequencies are the same as, or give a good indication of, its characteristics at low frequencies.”

p114. Link with homeostasis.

our inner economy must contain an assembly of thermostats, automatic hydrogen-ion-concentration controls, governors, and the like, which would be adequate for a great chemical plant. These are what we know collectively as our homeostatic mechanism.

Homeostatic feedbacks are slower than voluntary and postural feedbacks:

nerve fibers: para-/sympathetic systems (non-myelinated, slow) + effectors: smooth muscles and glands (slow)
or non-nervous channels: slower modes of transmission.

V. Computing Machines and the Nervous System

p116. Recording numbers: use a uniform scale (ie base).

Amount of information: I = log₂ n,
cost of recording information: (n - 1) A = (2^I - 1) A with A constant.

Divided among N scales: N (2^[I/N] - 1) A.

It is shown that minimum of the cost is reached with N = ∞.
To get N as large as possible and keep 2^[I/N] an integer, we use I/N = 1.
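
The cost formula is easy to tabulate. A short sketch, assuming I = 12 bits and A = 1 as arbitrary example values, and restricting N to the divisors of I so that each scale has an integer number 2^[I/N] of positions:

```python
def recording_cost(total_bits, n_scales, unit_cost=1.0):
    """Cost N * (2**(I/N) - 1) * A of recording I bits split evenly over N scales."""
    return n_scales * (2 ** (total_bits / n_scales) - 1) * unit_cost

I = 12  # assumed example amount of information, in bits
costs = {n: recording_cost(I, n) for n in (1, 2, 3, 4, 6, 12)}
print(costs)
# {1: 4095.0, 2: 126.0, 3: 45.0, 4: 28.0, 6: 18.0, 12: 12.0}
```

The cost keeps dropping as N grows, and the largest admissible N (here N = I = 12, one binary choice per scale) is the cheapest, which is the argument for the binary scale.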

p117. The binary system:

in which all that we know is that a certain quantity lies in one or the other of two equal portions of the scale, and in which the probability of an imperfect knowledge as to which half of the scale contains the observation is made vanishingly small

v = v0 + 1/2 v1 + 1/2² v2 + … + 1/2ⁿ vn + …

with vn in {0, 1}.

p117. Different types of machines:

Analogy machines: data are represented by measurements on some continuous scale.
Numerical machines: data are represented by a set of choices among a number of contingencies. These are the more accurate of the two.

Mentions that in a chain of computations, it is the slowest operation which gives the order of magnitude of the speed of the entire system. Thus, remove the human from the chain and perform all intermediate processes on the binary scale.

p118. Algorithms for combining contingencies: those are the rules for combining the numerical data in input.

The most obvious one is Boolean algebra, which is considered superior to other systems on the same basis as the superiority of binary arithmetic.

💡 Of course this is true only when minimizing cost of recording of information, not when considering for example the epistemological questions of what corresponds most to human experience.

All is just binary choices:

Thus all the data, numerical or logical, put into the machine are in the form of a set of choices between two alternatives, and all the operations on the data take the form of making a set of new choices depend on a set of old choices.
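
To make this concrete, here is a sketch of binary addition in which every new choice (a sum bit or a carry) depends only on old choices through Boolean operations. The adder construction is a standard textbook one, used here as an assumed illustration rather than the scheme described in the book.

```python
def half_adder(a, b):
    """Sum and carry of two binary choices, using only Boolean operations."""
    return a ^ b, a & b

def full_adder(a, b, carry_in):
    """Combine two bits and an incoming carry into a sum bit and an outgoing carry."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2

def add_binary(x_bits, y_bits):
    """Add two equal-length bit lists, least significant bit first; the result has one extra bit."""
    carry, out = 0, []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]

result = add_binary([1, 0, 1], [1, 1, 0])  # 5 + 3, least significant bit first
print(result)  # [0, 0, 0, 1], i.e. 8
```

The same functions handle numerical data (the bits of a number) and logical data (true/false choices) without any distinction.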

p119. Describes basically a CPU architecture: a bank of relays, with conditions “on” and “off”, whose positions are dictated by the positions of others at a previous stage. And with a central clock.

Then introduces memory: “special apparatus to retain an impulse which is to act at some future time”.

p120. Nervous system: neurons are either firing or reposing, mostly based on input messages from other neurons through synapses (from a few to hundreds).

Explains simplified firing with a clock (delay) and a threshold: if the inputs received before the delay are above the threshold, then the neuron will fire at the end of the current interval of time.
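
The simplified firing rule can be sketched as follows. The assumptions are loud ones: inputs are counted as 0/1 impulses, only excitatory inputs are modelled, and “above threshold” is read as “at least the threshold”. The function names are illustrative.

```python
def neuron_fires(inputs, threshold):
    """True if the impulses received during one clock interval reach the threshold."""
    return sum(inputs) >= threshold

def run_neuron(input_intervals, threshold):
    """The output at the end of each interval depends on the inputs gathered during it."""
    outputs = [False]  # nothing has been received before the first interval
    for inputs in input_intervals:
        outputs.append(neuron_fires(inputs, threshold))
    return outputs

spikes = run_neuron([[1, 0, 1], [0, 0, 1], [1, 1, 1]], threshold=2)
print(spikes)  # [False, True, False, True]
```

Seen this way, the neuron behaves like a relay whose on/off position at one stage is dictated by the positions of other relays at the previous stage.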

Memory of the nervous system: “preserve results of past operations for use in the future”.

p121. Brain vs machine:

the brain, under normal circumstances, is not the complete analogue of the computing machine but rather the analogue of a single run on such a machine.

p122. Memory challenges: “difficult to achieve a considerable time lag”. For example, use elastic vibrations. But before the cumulative deformation of the message becomes too big, “trigger off a new message of prescribed form”, like in telegraph type repeaters.

Condensers look like a good solution in some cases.

For more permanent records, lists solutions (magnetic tape, phosphorescent substances, photography).

Notices that methods of storage of information share a physical element in common: they depend on systems with a high degree of quantum degeneracy, a large number of modes of vibration of the same frequency.

Quantum degeneracy appears to be associated with the ability to make small causes produce appreciable and stable effects.

Notes that in the case of the nervous system, long-term memory is a more permanent change, like a threshold change through the permeability of synapses, …

Notes that if the chief changes of thresholds in the memory process are increases, then “the capital stock of power to live might decrease”. This might be the cause of senescence.

💡 Probably not true, to double check.

p124. Look at the light cast on logic by such machines.

The science of today is operational; that is, it considers every statement as essentially concerned with possible experiments or observable processes. According to this, the study of logic must reduce to the study of the logical machine…

This might reduce logic to psychology. On their difference:

All logic is limited by the limitations of the human mind when it is engaged in that activity known as logical thinking.

For example, “No admissible proof involves more than a finite number of stages”. He mentions mathematical induction (which may span an infinite number of cases), which is different from complete induction over an infinite set (which is impossible).
To prove Pn for all n, we need a single argument independent of n (mathematical induction).
These things are studied in metamathematics, discipline built by Gödel.

p126. “A logical machine following definite rules need never come to a conclusion”.

There he makes a parallel with paradoxes (like Cantor’s or Russell’s), where the answer would oscillate between “yes” and “no”.

Mentions Russell’s solution using types, then a time-based analogue for the machine: “attach a parameter to each statement, … being the time at which it is asserted”.
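The oscillation can be pictured with a toy machine that, at each clock tick, asserts the negation of what it asserted at the previous tick. This is only an illustration of the "yes/no" alternation with the time parameter made explicit; `liar_machine` is a hypothetical name.

```python
def liar_machine(initial, steps):
    """Return the answers asserted at times 0..steps, each negating the previous one."""
    answers = [initial]
    for _ in range(steps):
        answers.append(not answers[-1])
    return answers


print(liar_machine(True, 5))  # [True, False, True, False, True, False]
```

Indexed by time, each statement is unambiguous; the contradiction only appears when the time index is dropped.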

p126. What about the ability to learn? Consider two related notions: association of ideas and conditioned reflex. Based on Locke & Hume’s terms, ideas and impressions unite themselves into bundles based on similarity, contiguity and cause and effect.

Mentions that dynamics-thinking had not yet reached biology and physiology in the 18th century, it was rather collection-oriented, because there was “so much to explore”. Pavlov’s work is a good example of the first steps into dynamics.

p127. Affective tone

Pavlov has shown “union of patterns of behavior” (food shown together with another object to the dog), similar to Locke’s “union by contiguity”. But Pavlov is looking at visible actions, not introspective states of mind / ideas like Locke.

Affective tone is a central element of conditioned reflex: it is a scale from pain to pleasure.

an increase in affective tone favors all processes in the nervous system that are under way at the time

even the most suicidal apportioning of affective tone will produce a definite pattern of conduct.

🌅 Fig. 7 p.128: this shows parallel processes which produce an affective-tone mechanism, which in turn feed a “totalizer” which feeds back into processes.

When affective tone ↗️ , the feedback makes thresholds ↘️ . And inversely.

p129. We thus see that affective tone (only a model) is capable of learning.

Notes that the totalizer may just be a specific messaging system, like hormones which are efficient for “to whom it may concern” kind of messages (I would say publish-subscribe).

This shows that we should take into account hormonal transmission.

He notes, while being not sure of the scientific validity of this, that Freud’s theories put sex and memory very close to each other. Which doesn’t seem “absurd in principle”.

p130. Machines can potentially have conditioned reflexes.

Run of mechanical structure of the computing machine === life of the individual.

Idea for an artificial neuro-hormonal learning system: a message can change the grid bias of a number of vacuum tubes.

p130. More developed use of computing machines: solution of partial differential equations.

In case of non-linear PDEs, there is the idea that numerical calculations are going to create the data needed to start building a theory, instead of looking for them in nature:

as von Neumann has pointed out, we need them in order to form that acquaintance with a large number of particular cases without which we can scarcely formulate a general theory.

💡 It’s fun to see that we are talking AI some pages before, and here about the fact that numerical calculations are feeding theory modeling. If you make these two ideas a bit closer: on the AI end, this makes reinforcement learning, and on the theory end, this makes… artificial worlds science?

p131. Sketches an even more precise computer architecture:

Make common operations (add, multiply) “standard assemblages”, whereas less frequent ones would be built from these.
Component parts should be generic, not tied to other specific apparatus.
Allot components as needed through a kind of networking switch.

Makes the link with “traffic problems and overloading in the nervous system”.

p132.

The mechanical brain does not secrete thought “as the liver does bile”, as the earlier materialists claimed, … Information is information, not matter or energy.

VI. Gestalt and Universals

p133. Locke’s theory of association of ideas: three principles:

Contiguity.
Similarity.
Cause & effect, reduced by Locke and Hume to contiguity.

The author makes the hypothesis that, based on the previous chapter, there are neural mechanisms corresponding to these.

p133. Focus on similarity, in case of recognizing someone.

This involves a visual-muscular feedback system, which is abundant in animals, and can be largely superseded in terms of speed by an artificial mechanism thanks to electrical techniques.

2 eye-muscles feedbacks in man:

Of homeostatic nature, like pupil opening in the dark.
Of reflex nature, like putting an interesting object (moving, brilliant) in the fovea which is better for form and color.

For the 2nd case, the goal is to make the image of the object vary “within as small a range as possible”. This is kind of an optimisation that “diminishes the number of neuron channels involved in the transmission of visual information”.

p135. The author discusses what we would call now feature specialization of parts of the neural process

Recognition of outline drawing: this seems to be explained by the fact that “somewhere in the visual process, outlines are emphasized” compared to other “aspects of an image”.

💡 These aspects are now called features in a ML model.

Retina is subject to accommodation: constant stimulus => transmission of it ↘️ .
This allows not changing the character of an image which is stared at.

Mentions photographic plate treatments that increase contrasts, and thus allow repeating like in telegraph-type repeaters: trigger a new impression of standard sharpness from an image not too much blurred.

p136. The structure of the visual cortex is not using a highly generalized mechanism but corresponds to a permanent sub-assembly of specialized parts.

Consider the perspective transformations of an object, as a group (in math sense). Some of them (rotation, translations…) have continuously varying parameters. They “form multi-dimensional configurations in n-space, and contain sub-sets of transformations which constitute regions in such a space”.

Group scanning allows traversing a net of positions in a one-dimensional sequence. We can also approximate any transformation as near as we wish through transformations of this sequence, as long as scanning is fine enough.
We consider regions of maximum dimensionality of regions transformed by the group under consideration.

Describes a mechanizable process to identify the shape of a figure: scan and compare the resulting transformation regions with a fixed pattern and mark regions alike if these coincide (not clear here if it’s about comparing the result of transformations with the “front view” of an object or something along these lines).

p138. Modeling of Gestalts.

Notices that the “seems like” relation, as described by the previous process, is not necessarily transitive: if A === B and B === C, we can still have A !== C.

The universal “ideas” thus formed are not perfectly distinct but shade into one another.
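A tiny sketch of why a "seems like" relation built on a tolerance is not transitive. The numeric tolerance is an assumption standing in for "the regions coincide closely enough"; the values are arbitrary.

```python
def seems_like(a, b, tolerance=1.0):
    """Two values 'seem alike' when they differ by no more than the tolerance."""
    return abs(a - b) <= tolerance


a, b, c = 0.0, 0.8, 1.6
print(seems_like(a, b), seems_like(b, c), seems_like(a, c))  # True True False
```

Each neighbour pair passes the test, yet the two ends of the chain do not, which is the "shading into one another" of the quote above.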

💡 I would like to say “computable” for anything that relates a mechanizable process.

The author introduces a process that allows calculating some quantity Q for a set S of elements, transformed by transformations T. If we integrate these quantities over the group measure (done with group scanning based on the probability density of transformations), like ∫ Q(TS) dT, we have a quantity that “will be identical for all sets S interchangeable with one another under the transformations of the group, that is, for all sets S which have in some sense the same form or Gestalt”.
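A discrete sketch of the group integral, assuming the simplest possible setting: S is a finite sequence, the group is the set of cyclic shifts, and the group measure is uniform, so that ∫ Q(TS) dT becomes a plain average. The quantity `q` is an arbitrary choice that is deliberately not shift-invariant on its own.

```python
def shift(s, k):
    """Cyclic shift of the tuple s by k positions (one transformation T of the group)."""
    k %= len(s)
    return s[k:] + s[:k]


def q(s):
    """An arbitrary quantity Q that does depend on position."""
    return s[0] * s[1]


def group_average(q, s):
    """Average of Q(TS) over all cyclic shifts T (uniform measure, an assumption)."""
    n = len(s)
    return sum(q(shift(s, k)) for k in range(n)) / n


s = (3, 1, 4, 1, 5)
print(q(s), q(shift(s, 2)))                                  # 3 4
print(group_average(q, s), group_average(q, shift(s, 2)))    # 6.2 6.2
```

Q changes when the sequence is shifted, but its group average does not: the average depends only on the "form" shared by all shifted versions.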

p139. Notices that sound-enabled reading for the blind ("prosthesis of one lost sense by another") is possible based on the “visual Gestalt” processes like page alignment, traversing from one line to another…

Produces a tentative sound encoding of letters, a bit like Braille.

p140. Group scanning assembly.

Notices the problem of scanning letters by photocells, notably in terms of height. This needs transformation of the vertical dilation group.

🌅 Fig 8 describes such a mechanism by McCulloch: photocells, leads, oscillators and connections.

It represents a type of device “usable for any sort of group scanning”.
It is suggested that the fourth layer of the visual cortex might work (or at least be modeled) that way.
Also, the ear’s transposition from one fundamental pitch to another is a translation of the logarithm of frequency: it might be performed by such a device as well.

the group-scanning assembly is well adapted to form the sort of permanent sub-assembly of the brain…

💡 It’s kind of a neural network architecture in which layers are able to organise so as to build interesting transformations (in the sense they are related to a given goal, training) that isolate features.

p141. Performance of the scanning apparatus.

The order of magnitude is the time required to make a direct comparison of shapes of objects different in size.

Seems to be around: 1/10th of a second, which accords well with the time needed to stimulate all the layers of connectors.

💡 Interesting to see the relationship with medical/physiological experimentation, and the way he does compare results of such experiments with the model to justify that the model is accurate.

p141. Widespread synchronism and clocking mechanism in the cortex.

there is a widespread synchronism in different parts of the cortex, suggesting that it is driven from some clocking center.

And the frequency is on the same order as the alpha rhythm of the brain.
Alpha rhythm is compared to the scanning rhythm of a television apparatus, but for group scanning.
This rhythm is most marked in meditation, but changes function (carrier) during concentration, and disappears in sleep…

p142. Concludes that sensory prosthesis is possible.

memory and association areas (…) are available to store impressions gathered from other senses than the one to which they normally belong.

Thus, a blinded man (not congenital blind) keeps a part of his normal visual mechanism. But he loses the fixed assembly part.
Providing a prosthesis for this would need not only artificial visual receptors but also an artificial visual cortex.

💡 That’s the program for Neuralink basically. We can note the hardware vs software distinction, and we can even say that this is a strong methodological accomplishment of the author to build an understandable distinction between what is hardware and what is software.

Compares vision and audition quantitatively in terms of the different auditory “patterns at the cortical level”, and takes the shortcut of comparing the areas of the two parts of the cortex.
Vision:audition is 100:1. So recovering full audition with vision would still leave 95% vision, but the contrary would yield 10% vision (non-linear, based on the distance at which a resolution pattern is reached).

VII. Cybernetics and Psychopathology

p144. The brain is considered a computing machine, at least because this comparison/lens should help with psychopathology and maybe psychiatry.

p145. Method of checking is discussed. A good one is to “refer every operation simultaneously to two or three separate mechanisms”.

Then, the majority report is accepted by the collation mechanism.
Minority reports are indicated through a signal consisting of where and how they differ from majority.

Lewis Carroll in The Hunting of the Snark: “what I tell you three times is true”.
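A minimal sketch of such a collation mechanism, assuming each redundant unit simply returns a comparable value. The unit names and the `collate` function are hypothetical.

```python
from collections import Counter


def collate(reports):
    """reports maps a mechanism name to its result.

    Returns the majority result and the minority reports
    (which unit disagreed, and with what value).
    """
    counts = Counter(reports.values())
    majority, votes = counts.most_common(1)[0]
    if votes <= len(reports) / 2:
        raise ValueError("no majority among the mechanisms")
    minority = {name: value for name, value in reports.items() if value != majority}
    return majority, minority


result, dissent = collate({"unit_a": 42, "unit_b": 42, "unit_c": 41})
print(result, dissent)  # 42 {'unit_c': 41}
```

The minority report is kept rather than discarded, since it tells where and how a mechanism differs from the majority.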

p146. Functional disorders: they are not based on physiological or anatomical issues.

Notes again that the (adult) brain is not the empty physical structure of the computing machine that corresponds to it, but

the combination of this structure with the instructions given it at the beginning of a chain of operations and with all the additional information stored and gained from outside in the course of this chain.

Difference between circulating memories and long-term memories.

We are pretty sure we can’t reconstruct the ideational content which is recorded out of chains of neurons and synapses when the brain is dead. Figuring out a threshold for a synapse after death looks difficult.

Alteration of synaptic thresholds could be the cause of functional disorders. Even paresis.

p147. Link between the specious present and anxiety neuroses.

Memories of the specious present normally dissipate. But some eat up the neuron pool.
This second case is a malignant worry.

This will eat up short-term mental capacity. But, worse, permanent memory might get involved. The “ordinary mental life” can thus be destroyed.

Some similar pathological processes happen in machines. Mentions never-ending, circular processes.
Such contingencies happen due to highly improbable configurations.

💡 As if ending up in a Langton’s ant’s highway. It is an extremely complex mathematical problem to figure out which situations end up in highways.

p148. Clearing.

Solutions to this: clear the machine of all information, shake it, send large electric impulses, disconnect an erring part of apparatus. All of this in hope that the circular process stops.

Most efficient but irreversible is death.

Closest non-pathological: sleeping. “Sleep over it!”.

Prefrontal lobotomy, in vogue at the time. Maybe more in use because easing custodial care.
But this is actually working to remove a malignant worry, by “damaging or destroying the capacity for maintained worry”, otherwise called conscience.
Limits all access to circulating memory.

Let me remark in passing that killing them makes their custodial care still easier.

Less deleterious: shock treatment. Can damage memory (on purpose).

p149. On deeper-seated permanent memories. There seems to be no pharmaceutical or surgical weapon against this.

That’s where psychoanalysis and other psychotherapeutic measures come in.
These techniques are based on “the concept that the stored information of the mind lies on many levels of accessibility and is much richer and more varied than that which is accessible by direct unaided introspection”.

p150. Mentions limits in systems (based on D’Arcy Thompson):

each form of organization has an upper limit of size, beyond which it will not function.

In telephone switching systems, the high number of stages implies that, until traffic increases to the critical point, the system functions well, and breaks down right after.
Indeed, the probability of success is p^n, with p the probability of success at each stage and n the number of stages. p ↘️ => quality ↘️ ↘️ .
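A quick numerical sketch of the breakdown, assuming the stages succeed independently so that the end-to-end success probability over n stages is the product p**n. The number of stages (50) and the values of p are arbitrary.

```python
def end_to_end_success(p_stage, n_stages):
    """Probability that a call gets through every stage, assuming independent stages."""
    return p_stage ** n_stages


for p in (0.999, 0.99, 0.95, 0.9):
    print(p, round(end_to_end_success(p, 50), 3))
```

A small drop in the per-stage probability is amplified by the number of stages: the system works well almost up to the critical point and collapses quickly after it.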

Same goes for Man:

is then likely to perform a complicated type of behavior efficiently very close to the edge of an overload, when he will give way in a serious and catastrophic way.

Due to excess in amount of traffic, physical removal of channels, excessive occupation of channels (like pathological worries).

Long neuron chains => more mental disorders.

The size of the brain grows as the cube of its linear dimensions, but the connections grow as the square.
With gyri and sulci, the brain is optimising for short-distance, at the expense of longer-distance communication.
Thus, in a case of a traffic jam, “the processes involving parts of the brain quite remote from one another should suffer first”.

The higher processes deteriorate first in insanity.

💡 That’s evidence for the cybernetic view of the brain.

The cerebral functions are not distributed evenly over the two hemispheres, and one of these, the dominant hemisphere, has the lion’s share of the higher functions.

p153. Hemispheres

Damages in the non-dominant hemisphere have very small consequences on higher functions. For example, Pasteur had his right brain basically dead, but did “some of his best work” after the injury.

On forced educational change of handedness (for left-handers), “in very many cases these hemispheric changelings suffered from stuttering and other defects of speech, reading, and writing…”.
One possible explanation:

With the education of the secondary hand, there has been a partial education of that part of the secondary hemisphere which deals with skilled motions, such as writing. Since, however, these motions are carried out in the closest possible association with reading, speech and other activities which are inseparably connected with the dominant hemisphere, the neuron chains involved in processes of the sort must cross over from hemisphere to hemisphere and back…

Knowing that cerebral commissures (between hemispheres) are so few, a huge traffic jam then happens.

the human brain is probably too large already to use in an efficient manner all the facilities which seem to be anatomically present.

we may be facing one of those limitations of nature in which highly specialized organs reach a level of declining efficiency and ultimately lead to the extinction of the species.

VIII. Information, Language, and Society

p155. Leviathan of Hobbes: Man-State made up of lesser men => Leibniz (monads?) living organism a plenum with smaller organisms with their own life.

Anticipation of the cell theory.

The community of social animals (including Man and bees), looks like an individuality.

All the nervous tissue of the beehive is the nervous tissue of some single bee.

p156. Intercommunication of the members of a society/community.

Hormones is an example (sexual hormones).

About non-verbal communication (with a savage), “a signal without an intrinsic content may acquire meaning in his mind by what he observes at the time”.

Thus social animals may have an active, intelligent, flexible means of communication long before the development of language.

Distinguish the amount of information available to the race from that available to the individual.
Racial information is the one that changes an individual to act differently in a recognizable manner by other members of the race.

p158. Group (of individuals).

Can be characterised by autonomy (measured by number of decisions inside vs outside) and effective size (size to achieve some degree of autonomy).

Makes a clear separation between individual and race-level information.
Makes a criticism of Bush’s Memex, since according to him we would need a human with gigantic knowledge to perform comparisons between material (books). [Looks like bad judgement]

p158. Homeostasis and the lack of it.

The body politic lacks efficient homeostatic processes.
This is similar to higher business life, politics, diplomacy and war, in which there is no homeostasis: parties look for their own interest, sometimes form coalitions but only to be soon betrayed.

We are involved in the business cycles of boom and failure, in the succession of dictatorship and revolution, in the wars which everyone loses, which are so real a feature of modern times.

Considers these social constructs, like business, as games, which fall under game theory (Von Neumann and Morgenstern).
But players of real-life games are not totally rational players as in game theory.

Real-life games, “with the common man as their object”, are making him a loser (tricking him into voting, buying…).

But, for the author, closely knit communities that have been there for long enough are able to reach a good level of care and management of the issues of society: they “have a considerable measure of homeostasis”.

💡 This is probably not true in a rapidly changing world & society, is it?

Also, the author points out that the most effective and important anti-homeostasis factor of society is when “Lords of Things as They Are” (who want to play the game and gain power) control means of communication (notably: reduce private criticism, etc.).

One of the lessons of the present book is that any organism is held together in this action by the possession of means for the acquisition, use, retention and transmission of information.

p161. Power as a threat for homeostasis in society.

Notably, pervasiveness of business in media.

we have a triple constriction of the means of communication: the elimination of the less profitable means in favor of the more profitable; the fact that these means are in the hands of the very limited class of wealthy men, and thus naturally express the opinions of that class; and the further fact that, as one of the chief avenues to political and personal power, they attract above all those ambitious for such power.

Makes the point that the larger community is sometimes more stupid than the individual. And sometimes is less intelligent than the smaller community, notably in terms of proper sharing of information because of power agents (see above).

The author is pretty pessimistic that “anarchy of modern society” can be resolved.

p162. Extending methods of the natural sciences to social sciences is not the solution.

People believing this show “excessive optimism, and a misunderstanding of the nature of all scientific achievement”: we can’t isolate well social phenomena.

Loose coupling is well achieved with phenomena involving stars or molecules (maybe “mass effects” for the last ones, but not significant).

It is in the social sciences that the coupling between the observed phenomenon and the observer is hardest to minimize.

Traduttore, traditore: there is impact on and reduction of who you study.

the social scientist has not the advantage of looking down on his subjects from the cold heights of eternity and ubiquity.

💡 Basically saying that real self-objectification is impossible.

Gives the example that social scientists and philosophers are always reducing their field of study to what relates to them: people around them, durations limited to a lifespan, ideas of the moment.

in the social sciences we have to deal with short statistical runs, nor can we be sure that a considerable part of what we observe is not an artifact of our own creation. We are too much in tune with the objects of our investigation to be good probes.

Too small data compared to hard sciences.

There is much which we must leave, whether we like it or not, to the un-“scientific,” narrative method of the professional historian.

💡 Ask an AI to study us?

IX. On Learning and Self-Reproducing Machines

p169. Assertion that power to learn (ontogenetic learning) is similar to power to reproduce (phylogenetic learning) in living systems.

p170. What about man-made machines? Yes, they can learn and reproduce.

p170. About playing games.

There is the Von Neumann theory: how to build a “complete strategy”, working state by state backwards from the end (winning) move.
Works only for very simple games (as complexity explodes).
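A sketch of "working backwards from the end" on a deliberately tiny game, chosen here for illustration and not taken from the book: players alternately remove 1, 2 or 3 objects from a pile, and whoever takes the last object wins. Every position is labelled winning or losing from the positions it can reach.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def wins(pile):
    """True if the player to move can force a win from this pile size."""
    # A position is winning if some move leads to a position that is losing
    # for the opponent. With an empty pile there is no move: the mover has lost.
    return any(not wins(pile - take) for take in (1, 2, 3) if take <= pile)


def best_move(pile):
    """Return a winning number of objects to take, or None if every move loses."""
    for take in (1, 2, 3):
        if take <= pile and not wins(pile - take):
            return take
    return None


print([n for n in range(13) if not wins(n)])  # [0, 4, 8, 12]
print(best_move(10))                          # 2
```

The complete strategy is a table over all positions, which is exactly what stops scaling: for chess the table of positions is far too large to enumerate.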

There is also the statistical approach: look at past actions.
Works well with strategic games like war, checkers and chess.

A machine could work with that on chess for example, by examining previous games recorded.
That might need structuring the learning process in several steps. 💡 This is basically a preview of machine learning engineering.

p173. About linearity of feedback.

Mentions first-order programming (linear feedback) and second-order programming (more extensive use of the past and non-linear feedback).

🔴 Are the terms really used in the previous chapters?

Mentions Watanabe Information Theoretical Analysis of Multivariate Correlation (1960) which is about finding the most elegant solution to a geometrical problem.

p174. Theory of game-playing machines, applied to the “activity of struggle”.

💡 Again an instance of (swiftly) applying a theory as a prism to understand and manipulate the world. In some way, the “way of the engineer”?

Mentions how it can help analysing mongoose vs cobra: the cobra only does single actions one after the other (kind of a first-order feedback machine?) whereas the mongoose’s action “involves an appreciable, if not very long, segment of the whole past of the fight” (second-order feedback? at least we see clearly that there is a lower-frequency feedback).

…it must be remembered that the bullfight is not a sport but a dance with death, to exhibit the beauty and the interlaced coordinating actions of the bull and the man.

p175. On the use of learning machines (mechanization) in real games, notably war.

Asserts that the biggest danger of WWIII may come from the “unguarded use of learning machines”. Notably, the problem of not being able to turn them off:

To turn a machine off effectively, we must be in possession of information as to whether the danger point has come.

But the checker-playing machine can defeat its programmer.
💡 Maybe not a strong argument. Rather see Hamming lectures.

p176. Discusses the history of the dangers attributed to machines, which appear in older tales in the form of the problem of magic, talking about “the moral situation of the magician”.

Mentions Goethe’s “The Sorcerer’s Apprentice”, Arabian Nights’ genie and W. W. Jacobs’ fable of the monkey’s paw.

In all these stories the point is that the agencies of magic are literal-minded; and that if we ask for a boon from them, we must ask for what we really want and not for what we think we want. The new and real agencies of the learning machine are also literal-minded.

We cannot expect the machine to follow us in those prejudices and emotional compromises by which we enable ourselves to call destruction by the name of victory. If we ask for victory and do not know what we mean by it, we shall find the ghost knocking at our door.

p177. Self-propagating machines: creation of a replica capable of the same functions.

Is this possible combinatorially speaking? Can we build a machine complex enough to do that? Yes, according to J. von Neumann.

What about the operating procedure for building them? Introduces the non-linear transducer which is a very general approach.

Notes that non-linear transducers can be built with a linear combination of smaller non-linear parts, which allows for some learning techniques to apply (least-squares for example). Moreover, the author shows that “we can imitate any unknown non-linear transducer by a sum of linear terms”, thanks to the ergodic property. Devices exist for all of the operations needed to produce such apparatus.

To determine the coefficients for the linear parts, we can use feedback apparatus.
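A rough sketch of the white-box idea under strong simplifying assumptions: the black box is memoryless, the non-linear parts are plain powers of the input, the probe is uniform random noise, and the coefficients are adjusted by an error-feedback rule of the least-mean-squares kind. This is a swapped-in textbook technique, not the author's construction, and every name and constant below is hypothetical.

```python
import random


def black_box(x):
    """The unknown transducer to imitate (hidden from the fitting loop except via its output)."""
    return 2.0 * x - 0.5 * x ** 3


def terms(x):
    """Fixed non-linear parts of the white box (assumption: powers of x)."""
    return [1.0, x, x ** 2, x ** 3]


def white_box(coeffs, x):
    """Linear combination of the fixed non-linear parts."""
    return sum(c * t for c, t in zip(coeffs, terms(x)))


def fit_by_feedback(steps=20000, rate=0.1, seed=0):
    """Feed both boxes the same random input and nudge the coefficients by the output error."""
    rng = random.Random(seed)
    coeffs = [0.0, 0.0, 0.0, 0.0]
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        error = black_box(x) - white_box(coeffs, x)
        coeffs = [c + rate * error * t for c, t in zip(coeffs, terms(x))]
    return coeffs


coeffs = fit_by_feedback()
worst = max(abs(black_box(i / 50) - white_box(coeffs, i / 50)) for i in range(-50, 51))
print([round(c, 3) for c in coeffs], worst)
```

Only the coefficients of the linear combination are learned; the non-linear parts stay fixed, which is what makes a simple feedback adjustment sufficient.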

What we have succeeded in doing is to make a white box which can potentially assume the characteristics of any non-linear transducer whatever, and then to draw it into the similitude of a given black-box transducer (…) without any intervention on our part.

Compares this “philosophically” to the way genes are able to produce other molecules of the same gene with at their disposal “an indeterminate mixture of amino and nucleic acids”. Or viruses.

X. Brain Waves and Self-Organizing Systems

p181. This chapter is about a specific self-organizing system where non-linear parts are central: the self-organization of electroencephalograms or brain waves.

p181. Historical overview of electrophysiology which is about electrical potentials in nervous systems.

This science was slow to start because instruments were “not good enough to record small electrical potentials without heavy distortions”.

The new technique to accelerate this has been electronics: conduction of gases (vacuum tube) by Edison then conduction in vacuo (cathode-ray oscillograph).

Allow to follow the time-course of small potentials, between electrodes on the scalp.

p183. But the mathematical understanding of oscillation has been troublesome. At first, only the alpha rhythm was clearly detected (period of about 1/10th of a second).

The author then worked on this (around 1930), using autocorrelation: time mean of ƒ(t + τ)ƒ(t) (or conjugate multiplication).

For that:

To record better, the use of magnetic tape with frequency modulation is necessary.
Delayed tape reading, using distance between playback heads: f(t) and f(t + τ)
Multiply the two using square-law rectifiers and linear mixers based on 4ab=(a+b)²-(a-b)².
Approximate averaging using a resistor-capacitor network with a time constant long compared with the duration of the sample.
Repeat the process for multiple τ.
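The steps above can be sketched digitally, with a sampled signal in place of the tape and a plain average in place of the resistor-capacitor network (both are simplifications). The multiplication deliberately uses the quarter-square identity 4ab = (a+b)² − (a−b)². The test signal, a pure 10 Hz cosine sampled at 200 Hz, is an assumption chosen for illustration.

```python
import math


def quarter_square_product(a, b):
    """Multiply using only sums, differences and squares: ab = ((a+b)^2 - (a-b)^2) / 4."""
    return ((a + b) ** 2 - (a - b) ** 2) / 4.0


def autocorrelation(samples, lag):
    """Time mean of f(t + tau) * f(t), with tau expressed as a number of samples."""
    n = len(samples) - lag
    return sum(quarter_square_product(samples[i + lag], samples[i]) for i in range(n)) / n


rate = 200  # samples per second (assumption)
signal = [math.cos(2 * math.pi * 10 * i / rate) for i in range(rate * 10)]

# Repeat for several delays: 0 s, half a period (0.05 s) and a full period (0.1 s).
for lag in (0, 10, 20):
    print(lag / rate, round(autocorrelation(signal, lag), 6))
```

For this pure tone the autocorrelation oscillates between 0.5 and −0.5 with the period of the signal; a real recording would show the same oscillation with an amplitude that shrinks as τ grows.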

🌅 Figure 9 represents autocorrelation measured for τ varying from 0 to 17 seconds. It shows an oscillation of wave between 0.1 and 0.2 seconds, with amplitude reducing when τ ↗️ .

Makes the point that this is similar to Michelson’s interferometer in which the intensity of the interferometer fringes gives autocorrelation (“except for a linear transformation”).

💡 We can allow ourselves to generally think of data as “except for a linear transformation”.
Indeed, linear apparatus is easy to abstract out mathematically and also convenient in practice, because it is easy to build and commonly found in nature.
Thus, rather than manipulating data (for example the amplitude given by some interference fringes or by Wiener’s oscillator) we can directly manipulate its equivalence class of any other amount calculable with a linear transformation (or even, maybe, multiple of them).
For the question of determining the coefficients of the linear transformation, if they are unknown, simple feedback mechanisms exist as mentioned above.

p186. How to obtain a spectrum of a brain wave from autocorrelation.

Writes the autocorrelation in a Fourier form: C(t) = ∫ exp(2πi⍵t) dF(⍵), F ↗️ called integrated spectrum of ƒ.

Putting aside some specific part of the spectrum, we have C(t) = ∫ exp(2πi⍵t) ϕ(⍵) d⍵, ϕ the spectral density. If ϕ is L₂, we have ϕ(⍵) = ∫ C(t)exp(-2πi⍵t) dt which will look like a peak around -10 and one other around 10, 0 otherwise (“10 cycles”).
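A numerical sketch of the inversion ϕ(⍵) = ∫ C(t)exp(-2πi⍵t) dt, assuming a made-up autocorrelation: a 10-cycle cosine with exponential damping. Since this C is real and even, the integral reduces to 2∫₀^∞ C(t)cos(2π⍵t) dt, approximated here by a midpoint sum truncated at `t_max`. The damping rate, step and truncation are arbitrary choices.

```python
import math


def autocorrelation(tau):
    """Assumed autocorrelation: damped oscillation at 10 cycles per second."""
    return math.exp(-abs(tau)) * math.cos(2 * math.pi * 10 * tau)


def spectral_density(freq, t_max=10.0, dt=0.005):
    """Approximate 2 * integral from 0 to t_max of C(t) cos(2 pi freq t) dt (midpoint rule)."""
    steps = int(t_max / dt)
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += autocorrelation(t) * math.cos(2 * math.pi * freq * t) * dt
    return 2.0 * total


freqs = [k * 0.5 for k in range(41)]  # 0 to 20 cycles per second
peak = max(freqs, key=spectral_density)
print(peak)  # 10.0
```

The density is symmetric in ⍵, so the same peak appears at −10, matching the two peaks mentioned above; the faster the autocorrelation decays, the wider the peaks.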

Heterodyning method: using some basic transforms (shifting frequency distributions to have peaks around 0, getting the spectrum to the right and to the left of the central frequency at the distance ⍵…), we can build the spectrum.

The method is simpler for autocorrelations which are nearly sinusoidal. For that, we can take autocorrelations at regular intervals (0, 1/n, 2/n…), and average them in a certain way to remove the cosine component and keep only the sine. We only need to multiply with 1 or -1, which is easily practicable even with manual means.

Computers have started to be used, so that heterodyning is less needed.

p190. Results of harmonic analysis of brain waves.

🌅 Fig 11. Results of the harmonic analysis of the autocorrelation function of Fig 9.
Peak around 8.9 then dip then 9.0, then drop at 9.05.

There is a strong suggestion that the power in the peak corresponds to a pulling of the power away from the region where the curve is low.

After another electroencephalogram of the same subject four days later, the approx width of the peak is retained, as well as the form.

p192. Sampling problem, using integration in function space.

we shall be able to construct a statistical model of a continuing process with a given spectrum.

Such a model is enough to “yield statistically significant information concerning the root-mean-square error to be expected in brain-wave spectra”.

💡 All that matters for scientific investigation of a given object/process is to have a good enough approximation. For that, we use statistical models and consider their root-mean-square error.

x(t,⍺): ℝ × [0, 1] -> ℝ, representing one space variable of a Brownian motion, with ⍺ the parameter of the statistical distribution.

We look at ∫ϕ(t)dx(t,⍺).

If ℱ functional with ℱ[x(t,⍺)] function of ⍺ depending only on differences x(t₂,⍺) - x(t₁,⍺) (meaning it depends only on the evolution of x in time), Birkhoff’s ergodic theorem can be applied.

Examines the response of a linear transducer to a Brownian motion: ƒ(t,⍺) = ∫K(t+τ)dx(τ,⍺).
The ergodic theorem can be applied to its autocorrelation (integral of ƒ(t+τ,⍺)ƒ*(t,⍺)).
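
The ergodic statement above can be checked numerically with a rough discrete sketch: white Gaussian noise stands in for the Brownian increments dx, a short real kernel stands in for K (so ƒ* = ƒ), and the time average of ƒ(t+τ)ƒ(t) along a single long sample is compared with the sum of K(s)K(s+τ), which is its expected value. The kernel values, sample length, and function names are assumptions made for the example.

```python
import random

def transducer_response(kernel, noise):
    """Discrete analogue of f(t) = sum over j of K(j) * dx(t + j)."""
    m = len(kernel)
    return [sum(kernel[j] * noise[t + j] for j in range(m))
            for t in range(len(noise) - m + 1)]

def time_average_autocorrelation(series, lag):
    """Time average of series[t + lag] * series[t] along one sample path."""
    n = len(series) - lag
    return sum(series[t + lag] * series[t] for t in range(n)) / n

def kernel_autocorrelation(kernel, lag):
    """Sum of K(j) * K(j + lag): the ensemble-average autocorrelation of the response."""
    return sum(kernel[j] * kernel[j + lag] for j in range(len(kernel) - lag))

rng = random.Random(1)                # fixed seed so the run is repeatable
kernel = [1.0, 0.5, 0.25]             # hypothetical transducer kernel
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
response = transducer_response(kernel, noise)

for lag in range(4):
    print(lag, time_average_autocorrelation(response, lag), kernel_autocorrelation(kernel, lag))
```

With a long enough sample, the two columns should agree closely at each lag, which is what lets a single recording stand in for the whole ensemble.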

Calculates the spectrum and the sampled spectrum, where the autocorrelation is sampled over an averaging time: “the sampled spectrum and the true spectrum will have the same time-average value”.

Looks at the approximate spectrum, where the integration over τ is done over [0, 20 seconds].
Then calculates the root-mean-square error of the approximate sampled spectrum, and ends up with 1/6.

p197. Physiological questions about the dip phenomenon in frequency.

a sharp frequency line is equivalent to an accurate clock.

Notes that control and computation apparatus other than the brain use clocks for the purpose of gating: combining a large number of impulses into single impulses.

For desirable functioning of the apparatus, messages must be stored then released simultaneously, and combined while they are still on the machine. Thus gating and clock.

Synaptic mechanism: numerous fibers incoming, one outgoing.
For example, the outgoing fiber fires only when the proper combination of incoming ones fire within a very short interval of time.
In all cases, a short combination period is essential to combine incoming messages.
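
The synaptic gating described above can be pictured as a coincidence detector. The sketch below is only an illustration under simple assumptions: each incoming fiber is reduced to a list of firing times (in seconds), and the outgoing fiber fires if all required fibers fire within one short window. The function name `coincidence_fires`, the fiber labels, and the time values are hypothetical.

```python
def coincidence_fires(spikes, required, window):
    """Return True if every fiber in `required` fires within some interval of length `window`.

    spikes:   dict mapping a fiber name to its list of firing times (seconds)
    required: set of fiber names that must all be active together
    window:   length of the combination period (seconds)
    """
    events = sorted((t, fiber) for fiber, times in spikes.items() for t in times)
    for i, (start, _) in enumerate(events):
        # Fibers that fire between `start` and `start + window`.
        active = {fiber for t, fiber in events[i:] if t - start <= window}
        if required <= active:
            return True
    return False

spikes = {"a": [0.000, 0.050], "b": [0.002], "c": [0.051]}
print(coincidence_fires(spikes, {"a", "b"}, 0.005))        # a and b fire 2 ms apart
print(coincidence_fires(spikes, {"a", "b", "c"}, 0.005))   # c never falls in the same 5 ms window as b
print(coincidence_fires(spikes, {"a", "b", "c"}, 0.060))   # a much longer window combines all three
```

The shorter the window, the more the detector behaves like a clocked gate: inputs only count if they arrive together.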

p198. The central alpha rhythm of the brain.

There is experimental evidence of gating in the nervous system.
It is well known that there is a delay between an incoming visual signal and the consequent muscular activity.
The delay has been shown to consist of 3 parts: one constant, and 2 others uniformly distributed over about 0.1s.

It is as if the central nervous system could pick up incoming impulses only every 1/10 second, and as if the outgoing impulses to the muscles could arrive from the central nervous system only every 1/10 second.
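
A small sketch of why a clocked gate produces a delay spread uniformly over one period: if impulses are only picked up at ticks spaced 1/10 second apart, an impulse arriving at an arbitrary moment waits for the next tick. The function name `wait_for_next_tick` and the evenly spread arrival times are assumptions made for the illustration.

```python
def wait_for_next_tick(arrival_time, period=0.1):
    """Time (seconds) an impulse waits until the next gate tick, with ticks at multiples of `period`."""
    return (-arrival_time) % period

# Arrival times spread evenly over one second, avoiding exact tick instants.
arrivals = [(k + 0.5) / 1000 for k in range(1000)]
waits = [wait_for_next_tick(t) for t in arrivals]

print(wait_for_next_tick(0.03))   # roughly 0.07 s until the tick at 0.1 s
print(min(waits), max(waits))     # waits stay between 0 and one period
print(sum(waits) / len(waits))    # average wait is about half a period
```

One such wait on the input side and one on the output side would give two delay components, each spread over about 1/10 second, in addition to a constant part.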

💡 Again here the author uses an electric-apparatus model to describe the functioning of the brain. This is done in a “successful” or positive way, in the sense that it allows us next to build upon abstractions developed in the context of electricity theory and related technical apparatus.

The alpha rhythm can be modified by several means, notably light or electrostatic induction:

If a light is flickered into the eye at intervals with a period near 1/10 second, the alpha rhythm of the brain is modified until it has a strong component of the same period as the flicker.

p199. Analysis based on oscillators and linear/non-linear mechanisms.

It is important to observe that if the frequency of an oscillator can be changed by impulses of a different frequency, the mechanism must be non-linear.

Indeed, a linear mechanism only changes the phase and amplitude of an oscillation.
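
As a short worked check of this point, assume the linear mechanism is also time-invariant, with impulse response h (the symbols h and H are introduced here for illustration and are not in the notes). Feeding it a pure oscillation of frequency ω gives:

```latex
y(t) = \int h(s)\, x(t - s)\, ds, \qquad x(t) = e^{i\omega t}
\;\Longrightarrow\;
y(t) = \left( \int h(s)\, e^{-i\omega s}\, ds \right) e^{i\omega t} = H(\omega)\, e^{i\omega t},
\qquad H(\omega) = |H(\omega)|\, e^{i \arg H(\omega)}.
```

The output oscillates at the same frequency ω: only its amplitude (scaled by |H(ω)|) and its phase (shifted by arg H(ω)) change. No term at a different frequency can appear, so any displacement of frequency requires a non-linear mechanism.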

A non-linear mechanism can displace frequency, usually by attraction.
Considers that the attraction is probably a long-time phenomenon, whereas over short times the system behaves approximately linearly.

Models the brain as “a number of oscillators of frequencies of nearly 10 per second” and “within limitations these frequencies can be attracted to one another”.
As a consequence, frequencies are likely to be pulled together in certain regions of the spectrum, thus causing gaps in the spectrum. This corresponds closely to what is observed in the spectral analysis.
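
The attraction of frequencies can be illustrated with a toy simulation. This is not Wiener's model: it swaps in a Kuramoto-type pair of phase oscillators with sinusoidal (hence non-linear) coupling, integrated with a simple Euler scheme. The natural frequencies of 9.8 and 10.2 per second, the coupling strength, and the function name `mean_frequencies` are assumptions chosen for the example.

```python
import math

def mean_frequencies(f1, f2, coupling, duration=20.0, dt=0.001):
    """Simulate two coupled phase oscillators; return their mean frequencies (Hz)
    measured over the second half of the run, once transients have died out."""
    w1, w2 = 2 * math.pi * f1, 2 * math.pi * f2
    th1 = th2 = 0.0
    steps = int(round(duration / dt))
    half = steps // 2
    mid1 = mid2 = 0.0
    for step in range(steps):
        if step == half:
            mid1, mid2 = th1, th2          # remember the phases at mid-run
        d1 = w1 + coupling * math.sin(th2 - th1)   # non-linear pull toward the other oscillator
        d2 = w2 + coupling * math.sin(th1 - th2)
        th1 += d1 * dt
        th2 += d2 * dt
    span = (steps - half) * dt
    return (th1 - mid1) / (2 * math.pi * span), (th2 - mid2) / (2 * math.pi * span)

free = mean_frequencies(9.8, 10.2, coupling=0.0)     # no interaction: frequencies stay apart
locked = mean_frequencies(9.8, 10.2, coupling=3.0)   # interaction: both are pulled together
print(free, locked)
```

Without coupling the two oscillators keep their own frequencies; with enough coupling both settle on a common intermediate frequency, leaving the neighboring frequencies empty, which is the kind of gap the notes relate to the observed spectrum.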

p200. Examines the brain for the existence and nature of the postulated oscillators.

It has been observed that the cerebral cortex potentials oscillate with a period of about 1/10s (before dying out) after a flash of light is delivered to the eyes.

p200. Pulling together of frequencies: evidence in nature, correspondence with the previous model.

💡 Once you have a model, apply it everywhere! Lenses. But powerful.

Applies the same thinking to the diurnal rhythm of living beings: short-term oscillations are pulled together into a continuing oscillation, attracted to the 24-hour rhythm of the external environment.

The same goes for the flashes of fireflies, which are synchronized.
This should be modifiable by a flashing neon tube.

The phenomenon exists also in non-living beings.
Mentions electrical generating systems with parallel busbars which show a frequency regulation behavior.
But in series, this would show repulsion and thus be unstable, as previously observed.

The parallel system had a better homeostasis than the series system and therefore survived, while the series system eliminated itself by natural selection.

💡 Lenses in lenses, models in models.

p202. Relationship with the living and the brain. Generalization.

We thus see that a non-linear interaction causing the attraction of frequency can generate a self-organizing system

Mentions the problem of how molecules in the mixture of amino and nucleic acids ("indifferent magma") in cells can produce definite behavior, which remains unclear.
This is at the basis of the “fundamental phenomenon of life”: reproducing macromolecules.
Suggests that “the active bearer of the specificity of a molecule may lie in the frequency pattern of its molecular radiation”.
Thus, for example, a virus could emit infra-red oscillations which in turn favor the formation of other virus molecules from the indifferent magma.

Recognises that this is speculative, but that it deserves to be investigated, and presents a method for doing so: “study the absorption and emission spectra of a massive quantity of virus material”.

💡 Generalization of applicability of model. Speculative. Innovative. Creates directions for future research programs.

General review

The ideas I take away from it:

In terms of an overall synthesis of the book's content, and one of the author's main theses: the sciences and techniques of the second half of the 20th century will be dominated by communication and control. Domination? In research and practical application, but also in the sense that these techniques will be in command of the earlier ones (mechanical, electrical, …). Can we thus foresee a shift of power in society?
Synthesis of the openings the author proposes on the relationship with the human, and another of the author's fundamental theses: the human brain (and body) will be better (if not fully?) understood through the use of the cybernetic sciences, and we will thus see definite medical progress. This understanding and the “technical” applications that will follow from it (devices, drugs, medical / psychiatric procedures, etc.) derive directly from sciences and techniques applied to machines. By understanding the machine, the human scientist develops a more valid (but also more immediate?) understanding of their own science, an understanding they can then apply by analogy to the human being itself.
Reasoning by analogy is predominant, precisely for making the link between the understanding of / application to (say “scientific-technical relationship to”, or simply “relationship to”) the machine and the relationship to the living, in particular the human. But also the relationship to society (social sciences). This reasoning by analogy is defended as a cornerstone of the sound foundation of modern (perhaps industrialist?) scientific reasoning, which aims at progress through the most systematic possible application (❌) of the sciences.
On a more personal / pedagogical level, this kind of reading helps build an “applied” scientific mind. It helps one better understand certain validated physical results and the overall approach of their demonstration, but also their links within the history of science (and of thought).