Fixing SQL injection in AI-generated PHP
The five flavors of SQL injection AI assistants ship most often, what each one looks like in production code, and the canonical fix for each.
SQL injection is twenty-five years old and still ships in fresh code every week. AI assistants didn't invent the bug — they just happen to reproduce it more reliably than any junior developer ever did, because the training data is full of it. This page covers the five flavors we see most often in AI-generated PHP, why each one happens, and the canonical fix.
If you got here because a scanner flagged something in your code, jump to the section that matches the pattern. If you're learning the right way to do this from scratch, read top to bottom — the order is "most common to most subtle."
- String concatenation in the query (the classic)
- Identifier injection (placeholders can't bind these)
- LIMIT and OFFSET (same issue as identifiers)
- Dynamic
IN(?,?,?)with arrays - LIKE searches with user wildcards
1. String concatenation in the query
The default mistake. Half of the PHP tutorial corpus on the web does this, so AI models reproduce it as the most "average" way to query a database.
What it looks like:
$email = $_POST['email'];
$rows = mysqli_query($conn,
"SELECT * FROM users WHERE email = '" . $email . "'");
// or with PDO, equally broken:
$pdo->query("SELECT * FROM orders WHERE user_id = " . $_GET['uid']);
// or with sprintf, which adds nothing:
$sql = sprintf("SELECT * FROM logs WHERE level = '%s'", $_POST['level']);
$pdo->query($sql);
Every one of these is broken in the same way. An attacker who controls $email
controls the structure of the query, not just its values. Classic payload:
email = ' OR '1'='1 → SELECT * FROM users WHERE email = '' OR '1'='1' returns every user row in the database. email = '; DROP TABLE users; -- → Depending on driver settings, may execute the second statement. email = ' UNION SELECT password, NULL FROM users -- → Leaks the password column into the result set.
The fix — prepared statements with bound values:
// Plain PDO $stmt = $pdo->prepare('SELECT * FROM users WHERE email = ?'); $stmt->execute([$_POST['email']]); $rows = $stmt->fetchAll(); // Plain mysqli $stmt = $conn->prepare('SELECT * FROM users WHERE email = ?'); $stmt->bind_param('s', $_POST['email']); $stmt->execute(); $rows = $stmt->get_result()->fetch_all(MYSQLI_ASSOC); // UserSpice $db wrapper — same idea, less ceremony $rows = $db->query('SELECT * FROM users WHERE email = ?', [$_POST['email']])->results();
The query template is fixed at code-write time; the value is sent separately to the database. There is no parsing of the value as SQL. The bug becomes structurally impossible.
One trap to know about: in PHP, PDO with ATTR_EMULATE_PREPARES = true
(the default in some configurations) doesn't send a real prepared statement to the server —
it does the parameter substitution client-side, then sends a fully-formed query. This is
still safe against SQL injection, because PDO's emulation escapes correctly, but it
breaks a few other things (type handling, character-set edge cases). Set
$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false) in your connection
bootstrap and prefer real server-side prepares.
2. Identifier injection (the trap placeholders can't fix)
Prepared statements bind values. They cannot bind identifiers — table
names, column names, or sort directions. This is a SQL standard limitation, not a PHP one.
Trying to bind an identifier with a ? placeholder just doesn't work: you get
'users' in your query (a quoted string), not users (a table
reference).
The bug shows up most often in sort columns and search column-pickers:
$col = $_GET['sort']; // "name" or "created_at" — supposedly $dir = $_GET['dir']; // "asc" or "desc" $rows = $pdo->query("SELECT * FROM users ORDER BY $col $dir");
Payload: ?sort=password sorts by the column they shouldn't even know exists.
Worse, ?sort=(SELECT password FROM users WHERE id=1) lets the attacker leak
data into the sort order one bit at a time (a blind injection variant).
The fix — explicit whitelist:
$allowedCols = ['id', 'name', 'email', 'created_at'];
$allowedDirs = ['ASC', 'DESC'];
$col = in_array($_GET['sort'] ?? '', $allowedCols, true)
? $_GET['sort']
: 'id';
$dir = in_array(strtoupper($_GET['dir'] ?? ''), $allowedDirs, true)
? strtoupper($_GET['dir'])
: 'ASC';
$rows = $pdo->query("SELECT * FROM users ORDER BY $col $dir");
The interpolation back into the SQL is now safe because $col and
$dir can only hold values that came out of your array. Untrusted input
doesn't reach the query; it's just used to pick an entry from a list you wrote.
Same pattern works for "let the user pick a search column," "join one of three optional tables," and any other variable-identifier scenario. The principle: user input picks an entry from a fixed list of safe values; never reaches the query directly.
3. LIMIT and OFFSET (the same trap, sneakier)
LIMIT and OFFSET can't be bound as placeholders either — they're
part of the query structure, not values. PDO accepts the placeholder and silently sends a
string like LIMIT '10' to MySQL, which then rejects the query as a
syntax error. People notice this, switch to interpolation, and ship the bug.
// Looks reasonable, doesn't work — PDO sends LIMIT '10' as a quoted string. $stmt = $pdo->prepare('SELECT * FROM posts LIMIT ? OFFSET ?'); $stmt->execute([$perPage, $offset]); // syntax error from MySQL // Common AI "fix" — works, but ships the bug. $pdo->query("SELECT * FROM posts LIMIT " . $_GET['per_page'] . " OFFSET " . $_GET['offset']);
The fix — cast to int, then interpolate:
$perPage = (int)($_GET['per_page'] ?? 25);
$page = max(1, (int)($_GET['page'] ?? 1));
// Optional but recommended — clamp to known-good values
$perPage = in_array($perPage, [10, 25, 50, 100], true) ? $perPage : 25;
$offset = ($page - 1) * $perPage;
$rows = $pdo->query("SELECT * FROM posts ORDER BY id DESC LIMIT $offset, $perPage")
->fetchAll();
The (int) cast is what makes the interpolation safe. A payload like
10; DROP TABLE users casts to the integer 10. Non-numeric input
becomes 0. The cast must happen before the value reaches the SQL
string — always assign to a named variable first, never do LIMIT (int)$_GET['x']
inline (easy to miss the cast next time someone edits the call site).
4. Dynamic IN(?,?,?) with arrays
Got an array of IDs and want to query "give me all rows where id is in this list"? AI assistants get this wrong in two opposite directions.
Wrong direction one — concatenating the list:
$ids = $_POST['ids']; // presumed to be an array of integers
$pdo->query("SELECT * FROM products WHERE id IN (" . implode(',', $ids) . ")");
An attacker submits ids[]=1&ids[]=2)+OR+1=1--, the array becomes
[1, '2) OR 1=1 --'], and the entire table is returned.
Wrong direction two — single placeholder for the whole array:
$stmt = $pdo->prepare('SELECT * FROM products WHERE id IN (?)');
$stmt->execute([$ids]); // just doesn't work; runs `IN ('Array')`
One placeholder binds one value. You can't bind a list to it. The query runs against the literal string "Array" (or worse, a serialization).
The fix — generate one placeholder per array element, bind them all:
$ids = array_values(array_filter($_POST['ids'], 'is_numeric')); // ↑ defense-in-depth: ensure the array only contains numerics before we even bind. if (empty($ids)) { $rows = []; // nothing to query; don't run IN () } else { $placeholders = implode(',', array_fill(0, count($ids), '?')); $stmt = $pdo->prepare("SELECT * FROM products WHERE id IN ($placeholders)"); $stmt->execute($ids); $rows = $stmt->fetchAll(); }
Each ? binds one value; the count of placeholders matches the count of array
elements; the integer-only filter at the top is belt-and-suspenders. The check for
empty($ids) matters — IN () is a syntax error in MySQL.
5. LIKE searches with user wildcards
LIKE is a parameterized value in SQL, so the placeholder works:
WHERE name LIKE ? with bound value '%alice%' is fine. The bug
isn't injection in the SQL-injection sense — it's wildcard injection. The
attacker submits a search term containing % or _ and gets behavior
you didn't intend.
// User searches for "discount %" — they get every row containing "discount "
// followed by ANY characters. Not what they typed, and possibly a data-leak channel.
$term = $_POST['q'];
$stmt = $pdo->prepare("SELECT * FROM coupons WHERE code LIKE ?");
$stmt->execute(['%' . $term . '%']);
Mostly an information-disclosure / DoS concern (a search for % alone matches
every row in the table). Sometimes a privacy issue (the search reveals the existence of
rows the user shouldn't know about). Not as critical as classic SQLi, but real.
The fix — escape the wildcards before binding:
$term = $_POST['q'];
$escaped = addcslashes($term, '%_\\'); // escape MySQL LIKE wildcards
$stmt = $pdo->prepare("SELECT * FROM coupons WHERE code LIKE ? ESCAPE '\\\\'");
$stmt->execute(['%' . $escaped . '%']);
The ESCAPE '\\' clause tells MySQL that backslash is the escape character for
the wildcards inside this LIKE. Now % in user input is treated as a literal
percent sign, not "match anything."
What people think works but doesn't
The hall of fame for unsafe "fixes" that AI assistants and old tutorials still suggest:
-
mysqli_real_escape_string()— Safe for what it covers (quoting string values inside single quotes) if the connection charset is set correctly and the value is quoted. Not safe if you forgot a quote, used it on an identifier, or rely on it to escape integers. Real prepared statements remove all of these concerns; "escape and concatenate" leaves you one typo away from a bug. -
addslashes()— Designed for a different threat model (escaping foreval()). Has known charset-based bypasses. Don't use it for SQL ever. -
Stripping single quotes / strip_tags / removing keywords — All trivially
bypassable. Attackers don't need
SELECTif they haveSeLeCt, and they don't need single quotes if they have0x...hex literals. -
"My ORM handles it" — Most ORMs do for query-builder calls. None
of them do for raw query escapes like
DB::raw(),orderByRaw(), or string concatenation inside a where-clause method. Audit every raw call. - WAFs and "we have Cloudflare" — WAFs are a layer, not a fix. They catch the noisy attempts; they miss the targeted ones. Don't ship known-vulnerable code on the assumption that a generic ruleset will catch the attacker.
Tooling that catches it before you ship
All five flavors above are catchable by static analysis. The tools to point at the codebase:
-
Semgrep — Custom rules can flag
$pdo->query(),$db->query(), and friends when an argument contains string concatenation with a variable. The Security Scanner ships UserSpice-aware Semgrep rules out of the box. -
Psalm taint analysis — Marks
$_GET/$_POSTas tainted sources and traces them through your code to query sinks. Catches the cross-function cases Semgrep misses. -
/userspice-auditRule 04 — Walks every PHP file outsideusers/and flags concatenation in$db->query()calls with file:line. Runs as a Claude Code skill, no Docker needed.
Retrofitting an existing AI-coded project
One non-obvious thing if you're cleaning up a project that already has data in production:
don't naively swap raw $_POST reads for sanitizing helpers without checking the
display path.
On UserSpice (and any framework with a "sanitize on input" helper), if you start running
Input::get() on a save handler that previously took raw $_POST,
new rows get htmlspecialchars-encoded once on save. Old rows are still raw. If
the display path also gets safeReturn(), new rows render double-encoded
(Tom&Jerry instead of Tom&Jerry). Either escape at
output only (safe retrofit), or migrate the existing data alongside the code change.
The SQL injection fix itself doesn't have this problem — switching from concatenation to prepared statements is data-safe in both directions. But it's worth flagging that "security retrofit" and "data migration" are different problems, and the bigger the existing codebase, the more carefully you want to think about which one you're solving.
Want this done for you?
Fixing SQLi across a real codebase is mechanically simple but tedious. If you'd rather not do the grep-and-replace yourself, the form below goes straight to us — paste a repo URL or the specific files that came back flagged and we'll do the audit + fix.