<|instruction|>
Solve the below problem. The solution should start with an abridged problem statement. Then key observations. Then full solution based on the observations. Then C++ and Python implementations with comments.

140. Integer Sequences

time limit per test: 0.25 sec.
memory limit per test: 4096 KB


A sequence A is called an integer sequence of length N if all its elements A1 A2 .. AN are non-negative integers less than 2 000 000 000. Consider two integer sequences of length N, A and X. The result of their multiplication (A*X) is an integer number R=A1*X1 + A2*X2 + .. + AN*XN. Your task is to solve the equation A*X=B (mod P), given the integer sequence A and the integer numbers B and P.


Input

The first line contains the integer numbers N (1<=N<=100) - the length of the integer sequences - P (1<=P<=10 000) and B (0<=B<=P-1). The second line contains the elements of the sequence A, separated by blanks: A1 A2 .. AN.


Output

You should print one line containing the word "YES" if there exists at least one integer sequence X which is a solution to the equation, or print "NO" otherwise. If the answer is "YES", the next line should contain N non-negative integers separated by blanks: X1 X2 .. XN.


Sample Input #1

2 7 4
7 3
Sample Output #1

YES
0 6
Sample Input #2

3 10 1
2 4 6
Sample Output #2

NO
Author	: Mugurel Ionut Andreica
Resource	: SSU::Online Contester Fall Contest #2
Date	: Fall 2002

<|response|>
1. Abridged problem statement  
Given integers N (1≤N≤100), P (1≤P≤10000) and B (0≤B<P), and a sequence A of length N, find non-negative integers X₁,…,X_N (each taken modulo P) such that  
 A₁·X₁ + A₂·X₂ + … + A_N·X_N ≡ B (mod P).  
If a solution exists, print “YES” and one such X vector; otherwise print “NO.”

2. Key observations  
- The congruence  
 ∑_{i=1}^N A_i·X_i ≡ B (mod P)  
  is equivalent to the linear Diophantine equation  
 ∑ A_i·X_i + P·Y = B  
  in the unknowns X₁,…,X_N,Y (all integers, but we will reduce Xᵢ mod P at the end).  
- A necessary and sufficient condition for that Diophantine equation to have any integer solution is  
 g = gcd(A₁, A₂, …, A_N, P)  
  must divide B.  
- If we can exhibit Bézout coefficients x₁,…,x_N,y such that  
 ∑ A_i·x_i + P·y = g,  
  then multiplying everything by (B/g) gives a particular integer solution of our target equation. Reducing each xᵢ·(B/g) modulo P produces a valid non-negative Xᵢ in [0,P−1].

3. Full solution approach  
a. Preprocess  
   • Read N, P, B and the array A of length N.  
   • For safety reduce each A_i = A_i mod P; this does not change solvability mod P.  
   • Append P to the array A, making its length M = N+1.  

b. Compute gcd and Bézout coefficients for M numbers  
   We want coefficients x[0..M−1] so that  
     A[0]*x[0] + … + A[M−1]*x[M−1] = g = gcd(A[0],…,A[M−1]).  
   We do this inductively:  
   1. Base step: run extended_gcd on the last two elements A[M−2] and A[M−1], yielding g and coefficients x[M−2], x[M−1].  
   2. For i from M−3 down to 0:  
      – Let prev_g = current gcd.  
      – Run extended_gcd(A[i], prev_g), which returns new_g, and Bezout pair (xi, mult) such that  
          A[i]*xi + prev_g*mult = new_g.  
      – Update current gcd = new_g.  
      – Multiply all existing coefficients x[j] for j>i by mult, so that the overall identity still holds.  

c. Check divisibility  
   • If B % g ≠ 0, no solution → print “NO” and stop.  

d. Construct one solution  
   • Otherwise print “YES.”  
   • Discard the last Bézout coefficient (the one for P).  
   • Compute scale = B / g.  
   • For each i in [0..N−1], let  
       X_i = (x[i] * scale) mod P.  
     If X_i < 0, add P to make it non-negative.  
   • Print X₀ … X_{N−1}.  

Complexity: O(N² + N·log P), easily fits N≤100, P≤10000.

4. C++ implementation with detailed comments  
```cpp
#include <bits/stdc++.h>
using namespace std;

// Extended Euclidean algorithm
// Given a,b computes (g = gcd(a,b)) and finds x,y such that a*x + b*y = g
int64_t extended_gcd(int64_t a, int64_t b, int64_t &x, int64_t &y) {
    if (b == 0) {
        x = 1;
        y = 0;
        return a;
    }
    int64_t x1, y1;
    int64_t g = extended_gcd(b, a % b, x1, y1);
    // back-substitute
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int N, P, B;
    cin >> N >> P >> B;
    vector<int> A(N);
    for (int i = 0; i < N; i++) {
        cin >> A[i];
        A[i] %= P;             // reduce mod P
        if (A[i] < 0) A[i] += P;
    }
    // Append P to absorb the modulus term
    A.push_back(P);            // now size = N+1

    int M = N + 1;
    vector<int64_t> x(M);
    // 1) Compute gcd of last two elements
    int64_t g = extended_gcd(
        A[M-2], A[M-1],
        x[M-2], x[M-1]
    );
    // 2) Inductively incorporate A[i] for i = M-3…0
    for (int i = M - 3; i >= 0; i--) {
        int64_t prev_g = g, mult;
        // solve A[i]*xi + prev_g*mult = new_g
        int64_t new_g = extended_gcd(A[i], prev_g, x[i], mult);
        g = new_g;
        // multiply all existing coefficients x[j>i] by 'mult'
        for (int j = i + 1; j < M; j++) {
            x[j] *= mult;
        }
    }

    // 3) Check if B is divisible by g
    if (B % g != 0) {
        cout << "NO\n";
        return 0;
    }

    // 4) Build one particular solution and reduce modulo P
    cout << "YES\n";
    int64_t scale = B / g;
    // We drop the last coefficient (for P itself)
    for (int i = 0; i < N; i++) {
        int64_t Xi = (x[i] * scale) % P;
        if (Xi < 0) Xi += P;
        cout << Xi << (i+1 < N ? ' ' : '\n');
    }

    return 0;
}
```

5. Python implementation with detailed comments  
```python
import sys
sys.setrecursionlimit(10**7)

def extended_gcd(a, b):
    """
    Returns (g, x, y) such that a*x + b*y = g = gcd(a,b).
    """
    if b == 0:
        return (a, 1, 0)
    g, x1, y1 = extended_gcd(b, a % b)
    # b*x1 + (a%b)*y1 = g
    # a*y1 + b*(x1 - (a//b)*y1) = g
    return (g, y1, x1 - (a // b) * y1)

def main():
    data = sys.stdin.read().split()
    it = iter(data)
    N, P, B = map(int, (next(it), next(it), next(it)))
    A = [int(next(it)) % P for _ in range(N)]
    # Append P to handle the modulus term in one system
    A.append(P)           # length M = N+1

    M = N + 1
    # x[i] will hold the current Bézout coefficient for A[i]
    x = [0] * M

    # Step 1: base gcd on the last two entries
    g, x[M-2], x[M-1] = extended_gcd(A[M-2], A[M-1])

    # Step 2: inductive incorporation from i=M-3 down to 0
    for i in range(M-3, -1, -1):
        prev_g = g
        g, xi, mult = extended_gcd(A[i], prev_g)
        x[i] = xi
        # multiply existing x[j] for j>i by 'mult'
        for j in range(i+1, M):
            x[j] *= mult

    # Step 3: check divisibility
    if B % g != 0:
        print("NO")
        return

    # Step 4: build and print solution
    print("YES")
    scale = B // g
    result = []
    for i in range(N):
        Xi = (x[i] * scale) % P
        result.append(str(Xi))
    print(" ".join(result))

if __name__ == "__main__":
    main()
```