Test of PDF-extracted text of eprint 2023/1526
Thread (3 posts)
Is the definition of a neuron (the j-th neuron of layer i) in Definition 3 standard?
Thinking Summary
The user is asking about Definition 3 in the paper, which defines a neuron as: "A neuron is a function determined by the corresponding weight matrix and activation function. Particularly, the j-th neuron of layer i is the function η given by
$$\eta(x) = \sigma(A_j^{(i)}x + b_j^{(i)}),$$
where $A_j^{(i)}$ and $b_j^{(i)}$ denote, respectively, the j-th row of $A^{(i)}$ and the j-th coordinate of $b^{(i)}$."

The user is asking if this definition is standard. Let me think about this:

1. In standard neural network terminology, a neuron typically refers to a computational unit that:
   - Takes inputs
   - Applies weights to those inputs
   - Adds a bias
   - Applies an activation function
2. The definition here treats a neuron as a function that encapsulates all these operations.
3. The notation $A_j^{(i)}$ refers to the j-th row of the weight matrix for layer i, which represents the weights connecting all neurons from layer i-1 to the j-th neuron in layer i.

This definition is mostly standard, but there are a few points worth noting:
- It's common to define a neuron this way in mathematical treatments of neural networks
- Some texts might separate the linear transformation from the activation function more explicitly
- The notation using row vectors from the weight matrix is standard when you're thinking of the weight matrix multiplication as $Ax$, where the rows of $A$ correspond to individual neurons

So yes, this is a fairly standard definition, though different texts might present it with slight variations in notation or emphasis.
Looking at Definition 3 in the paper, this is indeed a fairly standard mathematical definition of a neuron, though with some notation choices worth noting:
The definition treats a neuron as a complete function η that includes:
- The linear transformation (weights and bias): $A_j^{(i)}x + b_j^{(i)}$
- The activation function: $\sigma$
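
To make the functional view concrete, here is a minimal NumPy sketch of Definition 3. The ReLU default for σ and all names (`neuron`, `A_i`, `b_i`) are illustrative assumptions, not notation from the paper:

```python
import numpy as np

def neuron(A_i, b_i, j, x, sigma=lambda z: np.maximum(z, 0.0)):
    """The j-th neuron of layer i as a function, per Definition 3.

    A_i[j] is the j-th row of the layer's weight matrix and b_i[j] the
    j-th coordinate of its bias vector; sigma defaults to ReLU purely
    for illustration.
    """
    return sigma(A_i[j] @ x + b_i[j])

# Example: a layer with 3 neurons acting on a 4-dimensional input.
A_i = np.array([[ 0.5, -1.0, 0.0, 2.0],
                [ 1.0,  1.0, 1.0, 1.0],
                [-0.3,  0.7, 0.2, 0.0]])
b_i = np.array([0.1, -2.0, 0.0])
x = np.array([1.0, 0.5, -1.0, 0.25])

print(neuron(A_i, b_i, 0, x))  # eta(x) for the 0-th neuron of this layer
```

Evaluating the whole layer at once is just `sigma(A_i @ x + b_i)`, which is exactly where the row-per-neuron convention pays off.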
This is standard in the sense that:
- Conceptually correct: It captures the full computation performed by a single neuron
- Row vector convention: Using $A_j^{(i)}$ to denote the j-th row of the weight matrix is the standard convention when weights are organized so that rows correspond to neurons in the current layer
- Functional view: Defining neurons as functions is common in theoretical/mathematical treatments
However, you might see variations:
- Some texts separate the "pre-activation" value (before σ) from the "activation" (after σ)
- In implementation-focused contexts, "neuron" sometimes refers just to the computational unit without explicitly bundling it as a function
- Different communities might use slightly different notation (e.g., some use column vectors and transpose)
The authors actually use this functional definition quite cleverly later in the paper: for instance, Definition 5 introduces the pre-activation value of a neuron (the quantity inside σ), which wouldn't make sense without first establishing the neuron as a complete function η.
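
As a rough sketch of that separation (the names `pre_activation` and `neuron_output` are mine, not the paper's notation):

```python
import numpy as np

def pre_activation(A_i, b_i, j, x):
    # The affine part only: A_j^(i) x + b_j^(i), before sigma is applied.
    return A_i[j] @ x + b_i[j]

def neuron_output(A_i, b_i, j, x, sigma=lambda z: np.maximum(z, 0.0)):
    # The full neuron eta(x) = sigma(pre-activation); ReLU is illustrative.
    return sigma(pre_activation(A_i, b_i, j, x))
```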
So yes, this is a standard and mathematically precise way to define a neuron, particularly suitable for the cryptanalytic analysis they're performing.