1. Abridged Problem Statement  
Given integers N, P, B and a sequence A of length N, find non-negative integers X₁,…,X_N (each in [0,P−1]) such that  
A₁·X₁ + A₂·X₂ + … + A_N·X_N ≡ B (mod P).  
Print “YES” and one solution X, or “NO” if none exists.

2. Detailed Editorial  
We wish to solve the linear congruence  
 ∑_{i=1}^N A_i X_i ≡ B  (mod P).  
Equivalently there must exist some integer Y such that  
 ∑_{i=1}^N A_i X_i + P·Y = B.  
This is a linear Diophantine equation in N+1 variables. A necessary and sufficient condition for a solution is that  
 g = gcd(A₁, A₂, …, A_N, P)  
divides B.

Moreover, if we can explicitly find integers x₁,…,x_N, y such that  
 ∑_{i=1}^N A_i x_i + P·y = g,  
then multiplying that entire equation by (B/g) gives a particular integer solution to  
 ∑ A_i·(x_i·(B/g)) + P·(y·(B/g)) = B.  
Reducing each X_i = x_i·(B/g) modulo P yields 0≤X_i<P and satisfies the original congruence.

How to compute the coefficients (x₁,…,x_N, y)? We use the extended Euclidean algorithm inductively:

1. Append P to the list of coefficients A, forming A' of length N+1.
2. Run extended-gcd on the last two elements A'_{N} and A'_{N+1}, obtaining their gcd g_{N} and Bezout coefficients for these two positions.
3. For i from N−1 down to 1, perform extended-gcd on A'_i and the current gcd g_{i+1} to get updated gcd g_i and new Bezout coefficients. We propagate the multiplier for the “old gcd” coefficient to all previously computed x_j (j>i).
4. At the end, we have coefficients x₁,…,x_N,x_{N+1} with gcd g = g₁.
5. Check if g divides B. If not, answer “NO.” Otherwise, multiply each x_i by (B/g) and reduce modulo P to get 0≤X_i<P as the solution.

Complexity: Each extended-gcd is O(log P), done N times. Updating the coefficients array takes O(N²) in the worst case. With N≤100, P≤10⁴, this is efficient.

3. Provided C++ Solution with Line-by-Line Comments  
```cpp
#include <bits/stdc++.h>
using namespace std;

// Overload output for pairs (for debugging/printing)
template<typename T1, typename T2>
ostream& operator<<(ostream& out, const pair<T1, T2>& x) {
    return out << x.first << ' ' << x.second;
}

// Overload input for pairs
template<typename T1, typename T2>
istream& operator>>(istream& in, pair<T1, T2>& x) {
    return in >> x.first >> x.second;
}

// Read a vector from input
template<typename T>
istream& operator>>(istream& in, vector<T>& a) {
    for (auto& x : a) in >> x;
    return in;
}

// Print a vector to output
template<typename T>
ostream& operator<<(ostream& out, const vector<T>& a) {
    for (auto x : a) out << x << ' ';
    return out;
}

int n, p, b;           // Global N (size), P (modulus), B (target)
vector<int> a;         // The array A

// Read input values
void read() {
    cin >> n >> p >> b;
    a.resize(n);
    cin >> a;
    // Reduce each A_i modulo p for safety
    for (auto& ai : a) ai %= p;
}

// Extended Euclidean algorithm for two numbers a and b.
// Fills x, y such that a*x + b*y = gcd(a,b).
// Returns gcd(a,b).
int64_t extended_euclid(int64_t a, int64_t b, int64_t& x, int64_t& y) {
    if (b == 0) {
        // Base case: gcd(a,0) = |a|, here a≥0
        x = 1; y = 0;
        return a;
    }
    // Recurse: compute gcd(b, a mod b)
    int64_t x1, y1;
    int64_t d = extended_euclid(b, a % b, x1, y1);
    // Back-substitute
    x = y1;
    y = x1 - y1 * (a / b);
    return d;
}

// Build Bezout coefficients for all elements of 'a' plus the appended modulus
// Returns (vector x of coeffs, gcd g) such that sum(a[i]*x[i]) = g.
pair<vector<int64_t>, int64_t> inductive_solve(const vector<int>& a) {
    int m = (int)a.size();
    vector<int64_t> x(m);
    // Start with last two elements
    int64_t g = extended_euclid(
        a[m-2], a[m-1], x[m-2], x[m-1]
    );
    // Incorporate one element at a time from right to left
    for (int i = m - 2; i-- > 0; ) {
        int64_t prev_g = g, mult;
        // gcd(a[i], prev_g) and Bezout: a[i]*x[i] + prev_g*mult = new g
        g = extended_euclid(a[i], prev_g, x[i], mult);
        // Multiply all existing coefficients x[j] (for j>i) by 'mult'
        for (int j = i + 1; j < m; ++j) {
            x[j] *= mult;
        }
    }
    return {x, g};
}

void solve() {
    // Append p so we can handle the modulus term in one Diophantine system
    a.push_back(p);
    auto [x, g] = inductive_solve(a);

    // If gcd doesn't divide b, no solution
    if (b % g != 0) {
        cout << "NO\n";
        return;
    }

    cout << "YES\n";
    // Drop the coefficient corresponding to p (last one)
    x.pop_back();

    // Scale solution by b/g and reduce modulo p
    int64_t scale = b / g;
    for (auto& xi : x) {
        xi = (xi * scale) % p;
        if (xi < 0) xi += p;
    }

    // Output the constructed X vector
    cout << x << "\n";
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Single test case
    read();
    solve();
    return 0;
}
```

4. Python Solution with Detailed Comments  
```python
import sys
sys.setrecursionlimit(10**7)

def extended_gcd(a, b):
    """
    Returns (g, x, y) such that a*x + b*y = g = gcd(a,b).
    """
    if b == 0:
        return (a, 1, 0)
    g, x1, y1 = extended_gcd(b, a % b)
    # After recursion: b*x1 + (a%b)*y1 = g
    # => a*y1 + b*(x1 - (a//b)*y1) = g
    return (g, y1, x1 - (a // b) * y1)

def solve():
    data = sys.stdin.read().split()
    it = iter(data)
    n, p, b = map(int, (next(it), next(it), next(it)))
    a = [int(next(it)) % p for _ in range(n)]
    # Append modulus term to absorb it into the same system
    a.append(p)

    # We'll compute coefficients x[i] so that sum(a[i]*x[i]) = g
    m = n + 1
    x = [0]*m

    # Start from the last two to get their gcd
    g, x[m-2], x[m-1] = extended_gcd(a[m-2], a[m-1])
    # Inductively incorporate earlier a[i]
    for i in range(m-3, -1, -1):
        prev_g = g
        g, xi, mult = extended_gcd(a[i], prev_g)
        x[i] = xi
        # Each previous coefficient x[j>i] must be multiplied by 'mult'
        for j in range(i+1, m):
            x[j] *= mult

    # Now we have sum(a[i]*x[i]) = g. We need g | b.
    if b % g != 0:
        print("NO")
        return

    # Otherwise we can form a solution
    print("YES")
    scale = b // g
    res = []
    # Drop the last coefficient (for p) and scale the rest
    for xi in x[:-1]:
        xi = (xi * scale) % p
        res.append(str(xi))
    # Output X1..XN
    print(" ".join(res))

if __name__ == "__main__":
    solve()
```

5. Compressed Editorial  
- We need X s.t. ∑A_i X_i ≡ B (mod P).  
- Equivalently ∑A_i X_i + P·Y = B.  
- Compute g = gcd(A₁,…,A_N,P) via iterative extended-GCD, tracking Bezout coeffs.  
- If g ∤ B, answer NO. Else multiply the Bezout solution by B/g and reduce Xi mod P for a valid non-negative solution.