<|instruction|>
Solve the below problem. The solution should start with an abridged problem statement. Then key observations. Then full solution based on the observations. Then C++ and Python implementations with comments.

511. Fermat's Last Theorem
Time limit per test: 0.75 second(s)
Memory limit: 262144 kilobytes
input: standard
output: standard

Given a positive integer n and a positive prime number p, find x, y and z such that x^n + y^n = z^n modulo p and x, y and z are nonzero modulo p, or report that there's no such triple.

Input
The first line of the input file contains the number t of testcases to solve, 1 ≤ t ≤ 1000. Each of the next t lines contains two integers n and p, 3 ≤ n ≤ 10^6, 2 ≤ p ≤ 10^6.

Output
For each input testcase, output one line:
when there exists a solution, output three integers x, y and z, 1 ≤ x, y, z ≤ p-1. If there are multiple solutions, output any.
when there's no solution, output one integer -1.


Example(s)
sample input
2
5 41
3 5

sample output
-1
1 2 4

<|response|>
## 1) Abridged problem statement

For each test case, given an integer \(n\ge 3\) and a **prime** \(p\), find nonzero residues
\[
x,y,z \in \{1,\dots,p-1\}
\]
such that
\[
x^n + y^n \equiv z^n \pmod p,
\]
or output `-1` if no such triple exists.

Any valid triple is accepted.

---

## 2) Key observations needed to solve the problem

1. **Nonzero residues mod prime form a cyclic group.**  
   Since \(p\) is prime, \((\mathbb{Z}/p\mathbb{Z})^{\times}\) is cyclic of size \(p-1\).  
   There exists a primitive root \(r\) such that every nonzero residue is \(r^k\).

2. **The set of \(n\)-th powers is a subgroup.**  
   Consider the map \(f(x)=x^n\). In exponent form:
   \[
   (r^k)^n = r^{kn}.
   \]
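   Let \(g=\gcd(n,p-1)\). As \(k\) runs over all integers, \(kn \bmod (p-1)\) takes exactly the multiples of \(g\) (by Bézout's identity), so the set of \(n\)-th powers equals: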
   \[
   H = \{ r^{g\cdot t} : t=0,1,\dots,\tfrac{p-1}{g}-1\},
   \]
   a subgroup of size \(|H|=(p-1)/g\), generated by \(r^g\).

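3. **It’s enough to search for a solution with \(z=1\).**  
   If \(x^n+y^n\equiv z^n \pmod p\) with \(z\not\equiv 0\), multiplying by \(z^{-n}\) gives \((xz^{-1})^n+(yz^{-1})^n\equiv 1\), so a solution exists if and only if some \(a,b\in H\) satisfy \(a+b\equiv 1\pmod p\).
   Given such a pair, \(a=x^n\), \(b=y^n\) for some \(x,y\), and we can output \((x,y,1)\) because \(1^n\equiv 1\).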

4. **Efficiently finding \(a+b\equiv 1\) inside \(H\).**  
   Enumerate all elements of \(H\) by repeatedly multiplying by \(r^g\):
   \[
   1, r^g, (r^g)^2, \dots
   \]
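   Store which ones have been seen, marking \(cur\) before the check. For each \(cur\in H\), test whether \(1-cur\) has already been seen. Looking only at earlier elements suffices: the condition is symmetric, so whichever of the two summands is enumerated second finds the other already stored, and marking \(cur\) first covers the case \(a=b\). If the test succeeds, we have the desired pair.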

5. **Recovering an \(n\)-th root from subgroup exponent.**  
   If \(cur = r^{gk}\), we need \(x=r^e\) such that
   \[
   r^{en} \equiv r^{gk} \pmod p \;\Longrightarrow\; en \equiv gk \pmod{p-1}.
   \]
   Divide by \(g\):
   \[
   e\cdot (n/g) \equiv k \pmod{(p-1)/g}.
   \]
   Since \(\gcd(n/g,(p-1)/g)=1\), \((n/g)^{-1}\) exists modulo \((p-1)/g\), so:
   \[
   e \equiv k \cdot (n/g)^{-1} \pmod{(p-1)/g}.
   \]
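Observations 2 and 5 can be checked numerically on a small prime. The following is a minimal Python sketch, not part of the solution itself: the helper name `nth_power_subgroup_demo` is hypothetical, the primitive root `r` is assumed to be supplied by the caller, and `pow(a, -1, m)` assumes Python 3.8 or newer.

```python
from math import gcd

def nth_power_subgroup_demo(n: int, p: int, r: int):
    """Return (h_direct, h_generated, roots) for a prime p with primitive root r.

    h_direct    : n-th powers computed as x^n for every nonzero x
    h_generated : the subgroup generated by r^g, g = gcd(n, p-1)
    roots       : maps each n-th power to one recovered n-th root
    """
    g = gcd(n, p - 1)
    m = (p - 1) // g  # expected size of the subgroup H

    # Observation 2: both constructions should give the same set.
    h_direct = {pow(x, n, p) for x in range(1, p)}
    h_generated = {pow(r, g * t, p) for t in range(m)}

    # Observation 5: solve e * (n/g) = k (mod m), then x = r^e.
    # When m == 1 the only exponent is 0, so no inverse is needed.
    inv = pow(n // g, -1, m) if m > 1 else 0
    roots = {}
    for k in range(m):
        value = pow(r, g * k, p)
        e = (k * inv) % m
        roots[value] = pow(r, e, p)
    return h_direct, h_generated, roots

h_direct, h_generated, roots = nth_power_subgroup_demo(3, 13, 2)
print(sorted(h_direct), sorted(h_generated))
print(all(pow(x, 3, 13) == value for value, x in roots.items()))
```

For \(n=3\), \(p=13\), \(r=2\), both constructions give the four cubes \(\{1,5,8,12\}\), and every recovered root cubes back to its value.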

---

## 3) Full solution approach

For each test case \((n,p)\):

1. Compute \(g=\gcd(n,p-1)\).
2. Find a primitive root \(r\) modulo \(p\).
3. Let \(gen = r^g \bmod p\). Then \(H=\langle gen\rangle\) is exactly the set of \(n\)-th powers.
4. Enumerate \(H\):
   - Start \(cur=1\).
   - Repeatedly do \(cur = cur \cdot gen \bmod p\).
   - Store `seen[cur] = step_index`.
   - For each `cur`, let `other = (1 - cur) mod p`; if `seen[other] != -1`, then:
     - `cur = r^(g*step)` and `other = r^(g*seen[other])` are \(n\)-th powers.
     - Convert both to actual bases \(x,y\) by solving the exponent equation using modular inverse.
     - Output `x y 1`.
   - If we return to `cur==1` after completing the cycle with no match, output `-1`.
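
The reduction to \(z=1\) that these steps rely on can be sanity-checked against a brute force over all triples for tiny primes. This is only a sketch with hypothetical helper names (`exists_brute`, `exists_with_z1`); it uses plain sets instead of the primitive-root enumeration, so it tests existence of a solution and nothing more.

```python
def exists_brute(n: int, p: int) -> bool:
    """Try every nonzero triple (x, y, z); feasible only for tiny p."""
    powers = [pow(x, n, p) for x in range(p)]
    return any(
        (powers[x] + powers[y] - powers[z]) % p == 0
        for x in range(1, p)
        for y in range(1, p)
        for z in range(1, p)
    )

def exists_with_z1(n: int, p: int) -> bool:
    """Reduced search: two n-th powers a, b with a + b = 1 (mod p)."""
    h = {pow(x, n, p) for x in range(1, p)}  # nonzero n-th powers
    # (1 - a) % p is 0 when a == 1, and 0 is never in h.
    return any((1 - a) % p in h for a in h)

# Both searches should agree on every small case.
for p in (2, 3, 5, 7, 11, 13, 17, 19, 23):
    for n in range(3, 13):
        assert exists_brute(n, p) == exists_with_z1(n, p)

print(exists_with_z1(5, 41), exists_with_z1(3, 5))  # False True
```

The last line agrees with the sample: no solution for \((n,p)=(5,41)\), and a solution for \((3,5)\).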

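Complexity per test:
- primitive root search: one modular exponentiation per proper divisor of \(p-1\) for each candidate; few candidates are needed in practice for \(p\le 10^6\),
- subgroup enumeration: at most \(|H|=(p-1)/g\) steps, stopping at the first match,
- \(O(p)\) time and memory to initialize the `seen` array,
- plus two modular inverses and two modular exponentiations to recover \(x\) and \(y\).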
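
If the primitive root search ever becomes a bottleneck, a common alternative to testing every proper divisor is to test only the exponents \((p-1)/q\) for each prime \(q\) dividing \(p-1\): an element of order less than \(p-1\) has order dividing one of them. The sketch below is a swapped-in variant, not the method used in the implementations that follow, and the names `prime_factors` and `primitive_root_fast` are hypothetical.

```python
def prime_factors(m: int) -> list:
    """Distinct prime factors of m by trial division."""
    factors = []
    d = 2
    while d * d <= m:
        if m % d == 0:
            factors.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        factors.append(m)  # leftover prime factor
    return factors

def primitive_root_fast(p: int) -> int:
    """Smallest primitive root of an odd prime p."""
    phi = p - 1
    qs = prime_factors(phi)
    for cand in range(2, p):
        # cand is primitive iff cand^(phi/q) != 1 for every prime q | phi
        if all(pow(cand, phi // q, p) != 1 for q in qs):
            return cand
    raise ValueError("no primitive root found; is p an odd prime?")

print(primitive_root_fast(41))  # 6
```

Each candidate then costs one modular exponentiation per distinct prime factor of \(p-1\) instead of one per divisor.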

---

## 4) C++ implementation with detailed comments

```cpp
#include <bits/stdc++.h>
using namespace std;

static inline long long modpow(long long a, long long e, long long mod) {
    long long r = 1 % mod;
    a %= mod;
    while (e > 0) {
        if (e & 1) r = (r * a) % mod;
        a = (a * a) % mod;
        e >>= 1;
    }
    return r;
}

// Extended Euclid to compute modular inverse of a mod m, assuming gcd(a,m)=1.
static inline long long modinv(long long a, long long m) {
    long long g = m, x = 0, y = 1;
    long long r = a;
    while (r != 0) {
        long long q = g / r;
        g -= q * r; swap(g, r);
        x -= q * y; swap(x, y);
    }
    // now g = gcd(a,m) = 1 and x*a == 1 (mod m); x may be negative, so normalize it
    x %= m;
    if (x < 0) x += m;
    return x;
}

// Find a primitive root modulo prime p.
// For prime p, g is primitive iff g^d != 1 for all proper divisors d of p-1.
static int primitive_root(int p) {
    int phi = p - 1;

    // Collect all divisors of phi
    vector<int> divs;
    for (int i = 1; 1LL * i * i <= phi; i++) {
        if (phi % i == 0) {
            divs.push_back(i);
            if (i * i != phi) divs.push_back(phi / i);
        }
    }
    sort(divs.begin(), divs.end()); // includes 1 and phi

    // Try candidates
    for (int g = 2; g < p; g++) {
        bool ok = true;
        // Check all proper divisors (skip the last one which is phi itself)
        for (int i = 0; i + 1 < (int)divs.size(); i++) {
            if (modpow(g, divs[i], p) == 1) {
                ok = false;
                break;
            }
        }
        if (ok) return g;
    }
    return -1; // should never happen for prime p
}

// Given that value = rt^(g*k) is an n-th power, compute x such that x^n = value.
// We solve: e*(n/g) = k (mod (p-1)/g), then x = rt^e.
static int nth_root_from_step(int k, int rt, int g, int n, int p) {
    int phi = p - 1;
    int n_div_g = n / g;
    int phi_div_g = phi / g;

    long long inv = modinv(n_div_g, phi_div_g);
    long long e = (inv * k) % phi_div_g;   // exponent in Z_(phi/g)
    return (int)modpow(rt, e, p);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int t;
    cin >> t;
    while (t--) {
        int n, p;
        cin >> n >> p;

        // p = 2: the only nonzero residue is 1, and 1^n + 1^n = 0 != 1 (mod 2),
        // so there is never a solution. The constraints allow p = 2, so handle it here.
        if (p == 2) {
            cout << -1 << "\n";
            continue;
        }

        int phi = p - 1;
        int g = std::gcd(n, phi);

        int rt = primitive_root(p);
        long long gen = modpow(rt, g, p); // generator of subgroup of n-th powers

        // seen[v] = step at which v = gen^step was generated (steps count from 1),
        // or -1 if not seen. Size p lets us index directly by residue.
        vector<int> seen(p, -1);

        long long cur = 1;
        int step = 0;

        while (true) {
            // Move to next subgroup element: cur = gen^step
            cur = (cur * gen) % p;
            step++;
            seen[(int)cur] = step;

            // If we returned to 1, we completed the whole subgroup without success.
            if (cur == 1) {
                cout << -1 << "\n";
                break;
            }

            int other = (1 - (int)cur) % p;
            if (other < 0) other += p;

            if (seen[other] != -1) {
                // cur + other == 1 (mod p), both are n-th powers, so:
                // x^n = cur, y^n = other, and z=1.
                int x = nth_root_from_step(step, rt, g, n, p);
                int y = nth_root_from_step(seen[other], rt, g, n, p);
                cout << x << " " << y << " " << 1 << "\n";
                break;
            }
        }
    }
    return 0;
}
```

---

## 5) Python implementation with detailed comments

```python
import sys
from math import gcd, isqrt

def modpow(a: int, e: int, mod: int) -> int:
    """Fast exponentiation: a^e mod mod."""
    a %= mod
    r = 1 % mod
    while e > 0:
        if e & 1:
            r = (r * a) % mod
        a = (a * a) % mod
        e >>= 1
    return r

def modinv(a: int, m: int) -> int:
    """Modular inverse a^{-1} mod m, assuming gcd(a,m)=1 (iterative extended Euclid)."""
    g, x, y = m, 0, 1
    r = a
    while r != 0:
        q = g // r
        g, r = r, g - q * r
        x, y = y, x - q * y
    return x % m

def primitive_root(p: int) -> int:
    """Find a primitive root modulo prime p."""
    phi = p - 1

    # list all divisors of phi
    divs = []
    for i in range(1, isqrt(phi) + 1):
        if phi % i == 0:
            divs.append(i)
            if i * i != phi:
                divs.append(phi // i)
    divs.sort()

    # g is primitive if g^d != 1 for all proper divisors d < phi
    for g in range(2, p):
        ok = True
        for d in divs[:-1]:  # skip phi itself
            if modpow(g, d, p) == 1:
                ok = False
                break
        if ok:
            return g
    return -1

def nth_root_from_step(k: int, rt: int, g: int, n: int, p: int) -> int:
    """
    We know value = rt^(g*k) is an n-th power.
    Find x such that x^n = value.

    Solve e*(n/g) = k (mod (p-1)/g), then x = rt^e.
    """
    phi = p - 1
    n_div_g = n // g
    phi_div_g = phi // g
    inv = modinv(n_div_g, phi_div_g)
    e = (inv * k) % phi_div_g
    return modpow(rt, e, p)

def solve_case(n: int, p: int) -> str:
    # Special case p=2: only residue is 1, and 1^n+1^n = 0 != 1 (mod 2)
    if p == 2:
        return "-1"

    phi = p - 1
    g = gcd(n, phi)

    rt = primitive_root(p)
    gen = modpow(rt, g, p)  # generator of subgroup of n-th powers

    seen = [-1] * p
    cur = 1
    step = 0

    while True:
        cur = (cur * gen) % p
        step += 1
        seen[cur] = step

        # Full cycle => no pair found
        if cur == 1:
            return "-1"

        other = (1 - cur) % p
        if seen[other] != -1:
            x = nth_root_from_step(step, rt, g, n, p)
            y = nth_root_from_step(seen[other], rt, g, n, p)
            return f"{x} {y} 1"

def main():
    data = sys.stdin.read().strip().split()
    t = int(data[0])
    out = []
    idx = 1
    for _ in range(t):
        n = int(data[idx]); p = int(data[idx + 1])
        idx += 2
        out.append(solve_case(n, p))
    sys.stdout.write("\n".join(out) + "\n")

if __name__ == "__main__":
    main()
```

In short: use the cyclic structure of the nonzero residues modulo a prime, enumerate the subgroup of \(n\)-th powers, find two elements summing to 1, then take \(n\)-th roots to build \((x,y,1)\).
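
Because any valid triple is accepted, a program's output may differ from the sample output and still be correct. A small checker makes this easy to confirm; the function name `is_valid_triple` is hypothetical and the sketch only validates a triple, it does not verify a `-1` answer.

```python
def is_valid_triple(n: int, p: int, x: int, y: int, z: int) -> bool:
    """Check 1 <= x, y, z <= p-1 and x^n + y^n = z^n (mod p)."""
    if not all(1 <= v <= p - 1 for v in (x, y, z)):
        return False
    return (pow(x, n, p) + pow(y, n, p) - pow(z, n, p)) % p == 0

# Second sample test (n=3, p=5).
print(is_valid_triple(3, 5, 1, 2, 4))  # True: the triple from the sample output
print(is_valid_triple(3, 5, 4, 3, 1))  # True: a different triple with z = 1
print(is_valid_triple(3, 5, 1, 1, 1))  # False: 1 + 1 != 1 (mod 5)
```

Tracing the implementations above by hand on the second sample test gives `4 3 1`, which this check accepts alongside the sample's `1 2 4`.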