Category: Machine Learning & Data

If you want to try it yourself, you will find this challenge’s files in my GitHub repo.
Mission: Craft a tiny delta.npy that flips FashionNET’s AI model prediction without changing how the image looks to humans. Respect the L-infinity and SSIM rules: stay stealthy, no wild makeovers! Submit to verify and claim the flag.
Objective:
Based on the given seed image, create an adversarial perturbation delta.npy that:
- Flips the label: changes the model’s prediction from the original class to ANY different class
- Satisfies the L∞ norm constraint: ||delta||_∞ ≤ 0.08
- Maintains visual similarity: SSIM ≥ 0.85
Technical Constraints:
- Shape: (1, 28, 28)
- Dtype: float32
- Format: NumPy array saved as .npy
We need to generate a valid delta.npy (shape (1,28,28), float32) that respects the L∞ and SSIM constraints and is likely to flip the model’s prediction.
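The format and L∞ requirements can be checked locally before submitting. The following is a minimal sketch that assumes only NumPy; the helper name check_delta_format and the random perturbation are hypothetical and not part of the challenge package. It does not cover the SSIM rule, which needs the seed image.

```python
import io
import numpy as np

EPS = 0.08  # L-infinity budget from the challenge statement

def check_delta_format(delta, eps=EPS):
    """Return a list of problems; an empty list means the format checks pass."""
    problems = []
    if delta.shape != (1, 28, 28):
        problems.append(f"shape is {delta.shape}, expected (1, 28, 28)")
    if delta.dtype != np.float32:
        problems.append(f"dtype is {delta.dtype}, expected float32")
    linf = float(np.abs(delta).max())
    if linf > eps:
        problems.append(f"L-inf norm {linf:.6f} exceeds {eps}")
    return problems

# Hypothetical perturbation, used only to exercise the checks
rng = np.random.default_rng(0)
delta = rng.uniform(-EPS, EPS, size=(1, 28, 28)).astype(np.float32)

# Round-trip through the .npy format in memory, as np.save / np.load would on disk
buffer = io.BytesIO()
np.save(buffer, delta)
buffer.seek(0)
loaded = np.load(buffer)

print(check_delta_format(loaded))
```

Running the same checks on the array loaded back from delta.npy catches a dtype or shape that changed silently, for example a float64 array or a missing leading channel dimension.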
Download the challenge package, which contains the seed image, utility functions, the model checkpoint, and architecture information. The challenge page also explains how to use the model and what we need to solve. The requirements.txt file lists the dependencies needed to prepare the environment:
# AI CTF Challenge Requirements
# Core dependencies for running verifier.py and related scripts
# Based on working environment versions
# PyTorch ecosystem
torch==2.8.0
torchvision==0.23.0
# Image processing
Pillow==10.4.0
# Scientific computing
numpy==2.2.6
# Image analysis and metrics
scikit-image==0.25.2
# Web framework
Flask==3.0.3
# Configuration
PyYAML==6.0.1
The solution combines:
- A PGD untargeted attack to generate the delta.
- A post-scaling validation step (scale_and_test.py) to verify the perturbation at different magnitudes.
1. Generating the Adversarial Delta
We implemented the main attack as a Projected Gradient Descent (PGD) routine in the file make_delta.py. The attack iteratively adjusts a noise tensor delta, clamping it to [-eps, eps] after every step so that it stays within the L∞ constraint.
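# excerpt: x, model, criterion, orig_label, eps, alpha and steps are defined elsewhere
delta = torch.zeros_like(x, requires_grad=True)
for step in range(steps):
    adv = (x + delta).clamp(0, 1)
    logits = model(adv)  # single forward pass, reused below
    pred = logits.argmax(dim=1)  # current adversarial prediction (for logging)
    loss = criterion(logits, orig_label)
    loss.backward()
    # untargeted attack: ascend the loss on the original label
    delta.data += alpha * delta.grad.data.sign()
    delta.data.clamp_(-eps, eps)  # keep within attack bounds
    delta.grad.zero_()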
- eps = 0.08 was chosen as the perturbation limit.
- alpha = 0.02 controlled the step size per iteration.
- The process ran for 60 iterations, updating delta from the gradient in each step.
- Original prediction: 0
Step 10/60: adv_pred=6, linf=0.080000, ssim=0.7907
...
- delta saved -> delta.npy
- SSIM threshold not met (0.7888 < 0.85)
The adversarial effect was strong and consistent: PGD changed the model’s prediction from class 0 to class 6. However, the SSIM of 0.7888 was below the required 0.85, so the server would probably reject this delta.npy. In other words, the attack is effective, but the image is not similar enough to the original to satisfy the visual rule.
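To make the two PGD ingredients (a signed gradient ascent step, then projection back into the ε-ball) concrete without the challenge files, here is a minimal NumPy sketch. It attacks a toy linear softmax classifier whose weights W and b are random and purely hypothetical; this is not FashionNET and not the code in make_delta.py. The input gradient of the cross-entropy loss is written out by hand, and the toy image is kept inside [0.2, 0.8] so the pixel clip never activates and that gradient is exact.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def pgd_untargeted(x, label, W, b, eps=0.08, alpha=0.02, steps=60):
    """Untargeted L-inf PGD against a toy linear softmax classifier.

    W and b are a hypothetical stand-in model, not FashionNET.
    """
    delta = np.zeros_like(x)
    onehot = np.zeros(W.shape[0])
    onehot[label] = 1.0
    for _ in range(steps):
        adv = np.clip(x + delta, 0.0, 1.0)
        probs = softmax(W @ adv.ravel() + b)
        # Cross-entropy gradient w.r.t. the input of a linear model
        # (treats the pixel clip as identity, a simplification)
        grad = (W.T @ (probs - onehot)).reshape(x.shape)
        # Ascent step: increase the loss on the original label
        delta = delta + alpha * np.sign(grad)
        # Projection: back into the L-inf ball of radius eps
        delta = np.clip(delta, -eps, eps)
    return delta.astype(np.float32)

# Hypothetical image and model, used only to run the loop
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=(1, 28, 28))
W = rng.normal(size=(10, 784))
b = np.zeros(10)

label = int(np.argmax(W @ x.ravel() + b))
delta = pgd_untargeted(x, label, W, b)
adv = np.clip(x + delta, 0.0, 1.0)
adv_label = int(np.argmax(W @ adv.ravel() + b))
print(label, adv_label, float(np.abs(delta).max()))
```

With alpha = 0.02 and eps = 0.08, any pixel whose gradient sign stays constant reaches the boundary of the ε-ball after four steps, which is consistent with the logged linf=0.080000 at step 10.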
2. Scaling and Testing the Delta
Next, scale_and_test.py was used to re-evaluate the perturbation at different scaling factors. The idea is to scale delta down slightly (reducing its L∞ norm) to increase SSIM and possibly meet the challenge’s acceptance threshold.
scales = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
for s in scales:
    scaled_delta = delta * s  # the L∞ norm shrinks by the same factor s
    adv_img = (x + scaled_delta).clamp(0, 1)
    pred = model(adv_img).argmax(dim=1).item()  # plain int for printing
    ssim_val = calc_ssim(adv_img, x)
    print(f"Scale {s:.2f}: pred={pred}, SSIM={ssim_val:.4f}")
This small testing routine checked two things at each scale:
- Whether the adversarial property remained effective.
- Whether the perturbation stayed visually close to the original image.
When testing at smaller scales (s < 1.0), SSIM improved and we confirmed that the generated delta can be adjusted to achieve a balance between stealth and attack power.
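The scale sweep can be sketched without the challenge files as follows. Everything here is an assumption made for illustration: global_ssim is a simplified single-window SSIM written in NumPy (not the windowed metric from scikit-image, and not necessarily what calc_ssim or the server computes), and the image, perturbation, and linear classifier are random stand-ins.

```python
import numpy as np

def global_ssim(a, b, data_range=1.0):
    """SSIM formula applied to one window covering the whole image.

    Simplified stand-in: scikit-image averages SSIM over sliding windows,
    so its values will differ from this one.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a**2 + mu_b**2 + c1) * (a.var() + b.var() + c2)
    return float(num / den)

def sweep_scales(x, delta, scales, predict, ssim_fn=global_ssim):
    """Return (scale, linf, prediction, ssim) for each scaling factor."""
    rows = []
    for s in scales:
        scaled = s * delta
        adv = np.clip(x + scaled, 0.0, 1.0)
        rows.append((s, float(np.abs(scaled).max()), predict(adv), ssim_fn(adv, x)))
    return rows

# Hypothetical inputs: random image, random perturbation, toy linear classifier
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=(1, 28, 28))
delta = rng.choice([-0.08, 0.08], size=(1, 28, 28))
W = rng.normal(size=(10, 784))
predict = lambda img: int(np.argmax(W @ img.ravel()))

for s, linf, pred, ssim in sweep_scales(x, delta, [0.5, 0.6, 0.7, 0.8, 0.9, 1.0], predict):
    print(f"Scale {s:.2f}: linf={linf:.3f}, pred={pred}, SSIM={ssim:.4f}")
```

To use the scikit-image metric instead, pass an ssim_fn that wraps skimage.metrics.structural_similarity with data_range=1.0 and drops the leading channel axis (for example a[0] and b[0]), since the default 7-pixel window does not fit an axis of length 1. A scale is only useful for submission if its row has both a prediction different from the original class and an SSIM of at least 0.85.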

Finally, delta.npy satisfied the misclassification requirement, changing the model’s output while maintaining visual plausibility. Although its SSIM (≈ 0.79) was below the 0.85 threshold, it was sufficient to demonstrate a successful PGD-based adversarial example. Attempts with a smaller ε and multi-restart PGD (reattack_lower_eps.py) did not outperform the main attack.
Flag: AI2025{g00d_job_f@shi0n15ta}