Can SQL Server create collisions in system generated constraint names?With MS SQL Server are the generated constraint names predictable?Setup error :Attempted to perform an unauthorized operationCan I create a constraint to a subset of data?Query Plan ErrorOptimizer gets the wrong number of estimated rows even with updated statsCan SQL Server system tables be defragmented?Error while running stored Proc sp_BlitzIndex (TM) v4.2 - September 03, 2016Scalar Operator in Seek Predicate and a Row Estimate of 1SQL Server LCK_M_X lock from .NET applicationWhy is selecting all resulting columns of this query faster than selecting the one column I care about?
Constructions of PRF (Pseudo Random Function)
How can I get this effect? Please see the attached image
'It addicted me, with one taste.' Can 'addict' be used transitively?
Why did C use the -> operator instead of reusing the . operator?
What makes accurate emulation of old systems a difficult task?
Is it idiomatic to construct against `this`
How to not starve gigantic beasts
Which big number is bigger?
Checks user level and limit the data before saving it to mongoDB
Multiple options vs single option UI
Are there physical dangers to preparing a prepared piano?
How did Captain America manage to do this?
Coordinate my way to the name of the (video) game
Elements that can bond to themselves?
Two field separators (colon and space) in awk
How to stop co-workers from teasing me because I know Russian?
Is the claim "Employers won't employ people with no 'social media presence'" realistic?
How to limit Drive Letters Windows assigns to new removable USB drives
What happens to Mjolnir (Thor's hammer) at the end of Endgame?
a sore throat vs a strep throat vs strep throat
Pre-plastic human skin alternative
"The cow" OR "a cow" OR "cows" in this context
What is the smallest unit of eos?
Why was the Spitfire's elliptical wing almost uncopied by other aircraft of World War 2?
Can SQL Server create collisions in system generated constraint names?
With MS SQL Server are the generated constraint names predictable?Setup error :Attempted to perform an unauthorized operationCan I create a constraint to a subset of data?Query Plan ErrorOptimizer gets the wrong number of estimated rows even with updated statsCan SQL Server system tables be defragmented?Error while running stored Proc sp_BlitzIndex (TM) v4.2 - September 03, 2016Scalar Operator in Seek Predicate and a Row Estimate of 1SQL Server LCK_M_X lock from .NET applicationWhy is selecting all resulting columns of this query faster than selecting the one column I care about?
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;
I have an application which creates millions of tables in a SQL Server 2008 database (non clustered). I am looking to upgrade to SQL Server 2014 (clustered), but am hitting an error message when under load:
“There is already an object named ‘PK__tablenameprefix__179E2ED8F259C33B’ in the database”
This is a system generated constraint name. It looks like a randomly generated 64-bit number. Is it possible that I am seeing collisions due to the large number of tables? Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision, but that assumes a uniform distribution. Is it possible that SQL Server changed its name generation algorithm between version 2008 and 2014 to increase the odds of collision?
The other significant difference is that my 2014 instance is a clustered pair, but I am struggling to form a hypothesis for why that would generate the above error.
P.S. Yes, I know creating millions of tables is insane. This is black box 3rd party code over which I have no control. Despite the insanity, it worked in version 2008 and now doesn’t in version 2014.
Edit: on closer inspection, the generated suffix always seems to start with 179E2ED8 - meaning the random part is actually only a 32-bit number and the odds of collisions are a mere 1-in-50, which is a much closer match to the error rate I’m seeing!
sql-server sql-server-2008 sql-server-2014 constraint
|
show 1 more comment
I have an application which creates millions of tables in a SQL Server 2008 database (non clustered). I am looking to upgrade to SQL Server 2014 (clustered), but am hitting an error message when under load:
“There is already an object named ‘PK__tablenameprefix__179E2ED8F259C33B’ in the database”
This is a system generated constraint name. It looks like a randomly generated 64-bit number. Is it possible that I am seeing collisions due to the large number of tables? Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision, but that assumes a uniform distribution. Is it possible that SQL Server changed its name generation algorithm between version 2008 and 2014 to increase the odds of collision?
The other significant difference is that my 2014 instance is a clustered pair, but I am struggling to form a hypothesis for why that would generate the above error.
P.S. Yes, I know creating millions of tables is insane. This is black box 3rd party code over which I have no control. Despite the insanity, it worked in version 2008 and now doesn’t in version 2014.
Edit: on closer inspection, the generated suffix always seems to start with 179E2ED8 - meaning the random part is actually only a 32-bit number and the odds of collisions are a mere 1-in-50, which is a much closer match to the error rate I’m seeing!
sql-server sql-server-2008 sql-server-2014 constraint
Loving the question! But wouldn't the tablenameprefix be different for each table?
– Randi Vertongen
12 hours ago
The table names are different but they use a naming convention which results in at least the first 11 characters being the same, and that seems to be all SQL Server uses in generating the constraint name.
– jl6
12 hours ago
Is the underlying hardware the same (number of cores and speed)? It could be that greater throughput increases the likelihood rather than a change in the algorithm.
– Dan Guzman
11 hours ago
The underlying hardware is different (newer generation of DL380) but not significantly higher performance. The aim of the exercise is to replace the out of support SQL Server 2008, not to improve throughput, and the hardware has been provisioned accordingly.
– jl6
11 hours ago
@Martin Smith - I’ve done some testing which seems to suggest that the first 8 bytes of the hex in the constraint name depend only on the name of the primary key column. The last 8 bytes look random.
– jl6
10 hours ago
|
show 1 more comment
I have an application which creates millions of tables in a SQL Server 2008 database (non clustered). I am looking to upgrade to SQL Server 2014 (clustered), but am hitting an error message when under load:
“There is already an object named ‘PK__tablenameprefix__179E2ED8F259C33B’ in the database”
This is a system generated constraint name. It looks like a randomly generated 64-bit number. Is it possible that I am seeing collisions due to the large number of tables? Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision, but that assumes a uniform distribution. Is it possible that SQL Server changed its name generation algorithm between version 2008 and 2014 to increase the odds of collision?
The other significant difference is that my 2014 instance is a clustered pair, but I am struggling to form a hypothesis for why that would generate the above error.
P.S. Yes, I know creating millions of tables is insane. This is black box 3rd party code over which I have no control. Despite the insanity, it worked in version 2008 and now doesn’t in version 2014.
Edit: on closer inspection, the generated suffix always seems to start with 179E2ED8 - meaning the random part is actually only a 32-bit number and the odds of collisions are a mere 1-in-50, which is a much closer match to the error rate I’m seeing!
sql-server sql-server-2008 sql-server-2014 constraint
I have an application which creates millions of tables in a SQL Server 2008 database (non clustered). I am looking to upgrade to SQL Server 2014 (clustered), but am hitting an error message when under load:
“There is already an object named ‘PK__tablenameprefix__179E2ED8F259C33B’ in the database”
This is a system generated constraint name. It looks like a randomly generated 64-bit number. Is it possible that I am seeing collisions due to the large number of tables? Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision, but that assumes a uniform distribution. Is it possible that SQL Server changed its name generation algorithm between version 2008 and 2014 to increase the odds of collision?
The other significant difference is that my 2014 instance is a clustered pair, but I am struggling to form a hypothesis for why that would generate the above error.
P.S. Yes, I know creating millions of tables is insane. This is black box 3rd party code over which I have no control. Despite the insanity, it worked in version 2008 and now doesn’t in version 2014.
Edit: on closer inspection, the generated suffix always seems to start with 179E2ED8 - meaning the random part is actually only a 32-bit number and the odds of collisions are a mere 1-in-50, which is a much closer match to the error rate I’m seeing!
sql-server sql-server-2008 sql-server-2014 constraint
sql-server sql-server-2008 sql-server-2014 constraint
edited 9 hours ago
jl6
asked 12 hours ago
jl6jl6
397311
397311
Loving the question! But wouldn't the tablenameprefix be different for each table?
– Randi Vertongen
12 hours ago
The table names are different but they use a naming convention which results in at least the first 11 characters being the same, and that seems to be all SQL Server uses in generating the constraint name.
– jl6
12 hours ago
Is the underlying hardware the same (number of cores and speed)? It could be that greater throughput increases the likelihood rather than a change in the algorithm.
– Dan Guzman
11 hours ago
The underlying hardware is different (newer generation of DL380) but not significantly higher performance. The aim of the exercise is to replace the out of support SQL Server 2008, not to improve throughput, and the hardware has been provisioned accordingly.
– jl6
11 hours ago
@Martin Smith - I’ve done some testing which seems to suggest that the first 8 bytes of the hex in the constraint name depend only on the name of the primary key column. The last 8 bytes look random.
– jl6
10 hours ago
|
show 1 more comment
Loving the question! But wouldn't the tablenameprefix be different for each table?
– Randi Vertongen
12 hours ago
The table names are different but they use a naming convention which results in at least the first 11 characters being the same, and that seems to be all SQL Server uses in generating the constraint name.
– jl6
12 hours ago
Is the underlying hardware the same (number of cores and speed)? It could be that greater throughput increases the likelihood rather than a change in the algorithm.
– Dan Guzman
11 hours ago
The underlying hardware is different (newer generation of DL380) but not significantly higher performance. The aim of the exercise is to replace the out of support SQL Server 2008, not to improve throughput, and the hardware has been provisioned accordingly.
– jl6
11 hours ago
@Martin Smith - I’ve done some testing which seems to suggest that the first 8 bytes of the hex in the constraint name depend only on the name of the primary key column. The last 8 bytes look random.
– jl6
10 hours ago
Loving the question! But wouldn't the tablenameprefix be different for each table?
– Randi Vertongen
12 hours ago
Loving the question! But wouldn't the tablenameprefix be different for each table?
– Randi Vertongen
12 hours ago
The table names are different but they use a naming convention which results in at least the first 11 characters being the same, and that seems to be all SQL Server uses in generating the constraint name.
– jl6
12 hours ago
The table names are different but they use a naming convention which results in at least the first 11 characters being the same, and that seems to be all SQL Server uses in generating the constraint name.
– jl6
12 hours ago
Is the underlying hardware the same (number of cores and speed)? It could be that greater throughput increases the likelihood rather than a change in the algorithm.
– Dan Guzman
11 hours ago
Is the underlying hardware the same (number of cores and speed)? It could be that greater throughput increases the likelihood rather than a change in the algorithm.
– Dan Guzman
11 hours ago
The underlying hardware is different (newer generation of DL380) but not significantly higher performance. The aim of the exercise is to replace the out of support SQL Server 2008, not to improve throughput, and the hardware has been provisioned accordingly.
– jl6
11 hours ago
The underlying hardware is different (newer generation of DL380) but not significantly higher performance. The aim of the exercise is to replace the out of support SQL Server 2008, not to improve throughput, and the hardware has been provisioned accordingly.
– jl6
11 hours ago
@Martin Smith - I’ve done some testing which seems to suggest that the first 8 bytes of the hex in the constraint name depend only on the name of the primary key column. The last 8 bytes look random.
– jl6
10 hours ago
@Martin Smith - I’ve done some testing which seems to suggest that the first 8 bytes of the hex in the constraint name depend only on the name of the primary key column. The last 8 bytes look random.
– jl6
10 hours ago
|
show 1 more comment
2 Answers
2
active
oldest
votes
Can SQL Server create collisions in system generated constraint names?
This depends on the type of constraint and version of SQL Server.
CREATE TABLE T1
(
A INT PRIMARY KEY CHECK (A > 0),
B INT DEFAULT -1 REFERENCES T1,
C INT UNIQUE,
CHECK (C > A)
)
SELECT name,
object_id,
CAST(object_id AS binary(4)) as object_id_hex,
CAST(CASE WHEN object_id >= 16000057 THEN object_id -16000057 ELSE object_id +2131483591 END AS BINARY(4)) AS object_id_offset_hex
FROM sys.objects
WHERE parent_object_id = OBJECT_ID('T1')
ORDER BY name;
drop table T1
Example Results 2008
+--------------------------+-----------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+-----------+---------------+----------------------+
| CK__T1__1D498357 | 491357015 | 0x1D498357 | 0x1C555F1E |
| CK__T1__A__1A6D16AC | 443356844 | 0x1A6D16AC | 0x1978F273 |
| DF__T1__B__1B613AE5 | 459356901 | 0x1B613AE5 | 0x1A6D16AC |
| FK__T1__B__1C555F1E | 475356958 | 0x1C555F1E | 0x1B613AE5 |
| PK__T1__3BD019AE15A8618F | 379356616 | 0x169C85C8 | 0x15A8618F |
| UQ__T1__3BD019A91884CE3A | 427356787 | 0x1978F273 | 0x1884CE3A |
+--------------------------+-----------+---------------+----------------------+
Example Results 2017
+--------------------------+------------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+------------+---------------+----------------------+
| CK__T1__59FA5E80 | 1509580416 | 0x59FA5E80 | 0x59063A47 |
| CK__T1__A__571DF1D5 | 1461580245 | 0x571DF1D5 | 0x5629CD9C |
| DF__T1__B__5812160E | 1477580302 | 0x5812160E | 0x571DF1D5 |
| FK__T1__B__59063A47 | 1493580359 | 0x59063A47 | 0x5812160E |
| PK__T1__3BD019AE0A4A6932 | 1429580131 | 0x5535A963 | 0x5441852A |
| UQ__T1__3BD019A981F522E0 | 1445580188 | 0x5629CD9C | 0x5535A963 |
+--------------------------+------------+---------------+----------------------+
For default constraints, check constraints and foreign key constraints the last 4 bytes of the auto generated name are a hexadecimal version of the objectid of the constraint. As objectid
are guaranteed unique the name must also be unique. In Sybase too these use tabname_colname_objectid
For unique constraints and primary key constraints Sybase uses
tabname_colname_tabindid, where tabindid is a string concatenation of
the table ID and index ID
This too would guarantee uniqueness.
SQL Server doesn't use this scheme.
In both SQL Server 2008 and 2017 it uses an 8 byte string at the end of the system generated name however the algorithm has changed as to how the last 4 bytes of that are generated.
In 2008 the last 4 bytes represent a signed integer counter that is offset from the object_id
by -16000057
with any negative value wrapping around to max signed int. (The significance of 16000057
is that this is the increment applied between successively created object_id
)
On 2012 upwards I don't see any pattern at all between the object_id of the constraint and the integer obtained by treating the last 8 characters of the name as the hexadecimal representation of a signed int.
The function names in the call stack in 2017 shows that it now creates a GUID as part of the name generation process (On 2008 I see no mention of MDConstraintNameGenerator
). I guess this is to provide some source of randomness. Clearly it isn't using the whole 16 bytes from the GUID in that 4 bytes that changes between constraints however.
I presume the new algorithm was done for some efficiency reason at the expense of some increased possibility of collisions in extreme cases such as yours.
add a comment |
Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision
Remember this is the "birthday problem". You're not trying to generate a collision for a single given hash, but rather measuring the probability that none of the many pairs of values will collide.
So with N tables, there are N*(N-1)/2 pairs, so here about 10^16 pairs. If the probability of a collision is 2^-64, the probability of a single pair not colliding is (1-2^-64), but with so many pairs, the probability of having no collisions here is about (1-2^-64)^(10^16), or more like 1/10,000. See eg https://preshing.com/20110504/hash-collision-probabilities/
And if it's only a 32bit hash the probablity of a collision crosses 1/2 at only 77k values.
add a comment |
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "182"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f236768%2fcan-sql-server-create-collisions-in-system-generated-constraint-names%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Can SQL Server create collisions in system generated constraint names?
This depends on the type of constraint and version of SQL Server.
CREATE TABLE T1
(
A INT PRIMARY KEY CHECK (A > 0),
B INT DEFAULT -1 REFERENCES T1,
C INT UNIQUE,
CHECK (C > A)
)
SELECT name,
object_id,
CAST(object_id AS binary(4)) as object_id_hex,
CAST(CASE WHEN object_id >= 16000057 THEN object_id -16000057 ELSE object_id +2131483591 END AS BINARY(4)) AS object_id_offset_hex
FROM sys.objects
WHERE parent_object_id = OBJECT_ID('T1')
ORDER BY name;
drop table T1
Example Results 2008
+--------------------------+-----------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+-----------+---------------+----------------------+
| CK__T1__1D498357 | 491357015 | 0x1D498357 | 0x1C555F1E |
| CK__T1__A__1A6D16AC | 443356844 | 0x1A6D16AC | 0x1978F273 |
| DF__T1__B__1B613AE5 | 459356901 | 0x1B613AE5 | 0x1A6D16AC |
| FK__T1__B__1C555F1E | 475356958 | 0x1C555F1E | 0x1B613AE5 |
| PK__T1__3BD019AE15A8618F | 379356616 | 0x169C85C8 | 0x15A8618F |
| UQ__T1__3BD019A91884CE3A | 427356787 | 0x1978F273 | 0x1884CE3A |
+--------------------------+-----------+---------------+----------------------+
Example Results 2017
+--------------------------+------------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+------------+---------------+----------------------+
| CK__T1__59FA5E80 | 1509580416 | 0x59FA5E80 | 0x59063A47 |
| CK__T1__A__571DF1D5 | 1461580245 | 0x571DF1D5 | 0x5629CD9C |
| DF__T1__B__5812160E | 1477580302 | 0x5812160E | 0x571DF1D5 |
| FK__T1__B__59063A47 | 1493580359 | 0x59063A47 | 0x5812160E |
| PK__T1__3BD019AE0A4A6932 | 1429580131 | 0x5535A963 | 0x5441852A |
| UQ__T1__3BD019A981F522E0 | 1445580188 | 0x5629CD9C | 0x5535A963 |
+--------------------------+------------+---------------+----------------------+
For default constraints, check constraints and foreign key constraints the last 4 bytes of the auto generated name are a hexadecimal version of the objectid of the constraint. As objectid
are guaranteed unique the name must also be unique. In Sybase too these use tabname_colname_objectid
For unique constraints and primary key constraints Sybase uses
tabname_colname_tabindid, where tabindid is a string concatenation of
the table ID and index ID
This too would guarantee uniqueness.
SQL Server doesn't use this scheme.
In both SQL Server 2008 and 2017 it uses an 8 byte string at the end of the system generated name however the algorithm has changed as to how the last 4 bytes of that are generated.
In 2008 the last 4 bytes represent a signed integer counter that is offset from the object_id
by -16000057
with any negative value wrapping around to max signed int. (The significance of 16000057
is that this is the increment applied between successively created object_id
)
On 2012 upwards I don't see any pattern at all between the object_id of the constraint and the integer obtained by treating the last 8 characters of the name as the hexadecimal representation of a signed int.
The function names in the call stack in 2017 shows that it now creates a GUID as part of the name generation process (On 2008 I see no mention of MDConstraintNameGenerator
). I guess this is to provide some source of randomness. Clearly it isn't using the whole 16 bytes from the GUID in that 4 bytes that changes between constraints however.
I presume the new algorithm was done for some efficiency reason at the expense of some increased possibility of collisions in extreme cases such as yours.
add a comment |
Can SQL Server create collisions in system generated constraint names?
This depends on the type of constraint and version of SQL Server.
CREATE TABLE T1
(
A INT PRIMARY KEY CHECK (A > 0),
B INT DEFAULT -1 REFERENCES T1,
C INT UNIQUE,
CHECK (C > A)
)
SELECT name,
object_id,
CAST(object_id AS binary(4)) as object_id_hex,
CAST(CASE WHEN object_id >= 16000057 THEN object_id -16000057 ELSE object_id +2131483591 END AS BINARY(4)) AS object_id_offset_hex
FROM sys.objects
WHERE parent_object_id = OBJECT_ID('T1')
ORDER BY name;
drop table T1
Example Results 2008
+--------------------------+-----------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+-----------+---------------+----------------------+
| CK__T1__1D498357 | 491357015 | 0x1D498357 | 0x1C555F1E |
| CK__T1__A__1A6D16AC | 443356844 | 0x1A6D16AC | 0x1978F273 |
| DF__T1__B__1B613AE5 | 459356901 | 0x1B613AE5 | 0x1A6D16AC |
| FK__T1__B__1C555F1E | 475356958 | 0x1C555F1E | 0x1B613AE5 |
| PK__T1__3BD019AE15A8618F | 379356616 | 0x169C85C8 | 0x15A8618F |
| UQ__T1__3BD019A91884CE3A | 427356787 | 0x1978F273 | 0x1884CE3A |
+--------------------------+-----------+---------------+----------------------+
Example Results 2017
+--------------------------+------------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+------------+---------------+----------------------+
| CK__T1__59FA5E80 | 1509580416 | 0x59FA5E80 | 0x59063A47 |
| CK__T1__A__571DF1D5 | 1461580245 | 0x571DF1D5 | 0x5629CD9C |
| DF__T1__B__5812160E | 1477580302 | 0x5812160E | 0x571DF1D5 |
| FK__T1__B__59063A47 | 1493580359 | 0x59063A47 | 0x5812160E |
| PK__T1__3BD019AE0A4A6932 | 1429580131 | 0x5535A963 | 0x5441852A |
| UQ__T1__3BD019A981F522E0 | 1445580188 | 0x5629CD9C | 0x5535A963 |
+--------------------------+------------+---------------+----------------------+
For default constraints, check constraints and foreign key constraints the last 4 bytes of the auto generated name are a hexadecimal version of the objectid of the constraint. As objectid
are guaranteed unique the name must also be unique. In Sybase too these use tabname_colname_objectid
For unique constraints and primary key constraints Sybase uses
tabname_colname_tabindid, where tabindid is a string concatenation of
the table ID and index ID
This too would guarantee uniqueness.
SQL Server doesn't use this scheme.
In both SQL Server 2008 and 2017 it uses an 8 byte string at the end of the system generated name however the algorithm has changed as to how the last 4 bytes of that are generated.
In 2008 the last 4 bytes represent a signed integer counter that is offset from the object_id
by -16000057
with any negative value wrapping around to max signed int. (The significance of 16000057
is that this is the increment applied between successively created object_id
)
On 2012 upwards I don't see any pattern at all between the object_id of the constraint and the integer obtained by treating the last 8 characters of the name as the hexadecimal representation of a signed int.
The function names in the call stack in 2017 shows that it now creates a GUID as part of the name generation process (On 2008 I see no mention of MDConstraintNameGenerator
). I guess this is to provide some source of randomness. Clearly it isn't using the whole 16 bytes from the GUID in that 4 bytes that changes between constraints however.
I presume the new algorithm was done for some efficiency reason at the expense of some increased possibility of collisions in extreme cases such as yours.
add a comment |
Can SQL Server create collisions in system generated constraint names?
This depends on the type of constraint and version of SQL Server.
CREATE TABLE T1
(
A INT PRIMARY KEY CHECK (A > 0),
B INT DEFAULT -1 REFERENCES T1,
C INT UNIQUE,
CHECK (C > A)
)
SELECT name,
object_id,
CAST(object_id AS binary(4)) as object_id_hex,
CAST(CASE WHEN object_id >= 16000057 THEN object_id -16000057 ELSE object_id +2131483591 END AS BINARY(4)) AS object_id_offset_hex
FROM sys.objects
WHERE parent_object_id = OBJECT_ID('T1')
ORDER BY name;
drop table T1
Example Results 2008
+--------------------------+-----------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+-----------+---------------+----------------------+
| CK__T1__1D498357 | 491357015 | 0x1D498357 | 0x1C555F1E |
| CK__T1__A__1A6D16AC | 443356844 | 0x1A6D16AC | 0x1978F273 |
| DF__T1__B__1B613AE5 | 459356901 | 0x1B613AE5 | 0x1A6D16AC |
| FK__T1__B__1C555F1E | 475356958 | 0x1C555F1E | 0x1B613AE5 |
| PK__T1__3BD019AE15A8618F | 379356616 | 0x169C85C8 | 0x15A8618F |
| UQ__T1__3BD019A91884CE3A | 427356787 | 0x1978F273 | 0x1884CE3A |
+--------------------------+-----------+---------------+----------------------+
Example Results 2017
+--------------------------+------------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+------------+---------------+----------------------+
| CK__T1__59FA5E80 | 1509580416 | 0x59FA5E80 | 0x59063A47 |
| CK__T1__A__571DF1D5 | 1461580245 | 0x571DF1D5 | 0x5629CD9C |
| DF__T1__B__5812160E | 1477580302 | 0x5812160E | 0x571DF1D5 |
| FK__T1__B__59063A47 | 1493580359 | 0x59063A47 | 0x5812160E |
| PK__T1__3BD019AE0A4A6932 | 1429580131 | 0x5535A963 | 0x5441852A |
| UQ__T1__3BD019A981F522E0 | 1445580188 | 0x5629CD9C | 0x5535A963 |
+--------------------------+------------+---------------+----------------------+
For default constraints, check constraints and foreign key constraints the last 4 bytes of the auto generated name are a hexadecimal version of the objectid of the constraint. As objectid
are guaranteed unique the name must also be unique. In Sybase too these use tabname_colname_objectid
For unique constraints and primary key constraints Sybase uses
tabname_colname_tabindid, where tabindid is a string concatenation of
the table ID and index ID
This too would guarantee uniqueness.
SQL Server doesn't use this scheme.
In both SQL Server 2008 and 2017 it uses an 8 byte string at the end of the system generated name however the algorithm has changed as to how the last 4 bytes of that are generated.
In 2008 the last 4 bytes represent a signed integer counter that is offset from the object_id
by -16000057
with any negative value wrapping around to max signed int. (The significance of 16000057
is that this is the increment applied between successively created object_id
)
On 2012 upwards I don't see any pattern at all between the object_id of the constraint and the integer obtained by treating the last 8 characters of the name as the hexadecimal representation of a signed int.
The function names in the call stack in 2017 shows that it now creates a GUID as part of the name generation process (On 2008 I see no mention of MDConstraintNameGenerator
). I guess this is to provide some source of randomness. Clearly it isn't using the whole 16 bytes from the GUID in that 4 bytes that changes between constraints however.
I presume the new algorithm was done for some efficiency reason at the expense of some increased possibility of collisions in extreme cases such as yours.
Can SQL Server create collisions in system generated constraint names?
This depends on the type of constraint and version of SQL Server.
CREATE TABLE T1
(
A INT PRIMARY KEY CHECK (A > 0),
B INT DEFAULT -1 REFERENCES T1,
C INT UNIQUE,
CHECK (C > A)
)
SELECT name,
object_id,
CAST(object_id AS binary(4)) as object_id_hex,
CAST(CASE WHEN object_id >= 16000057 THEN object_id -16000057 ELSE object_id +2131483591 END AS BINARY(4)) AS object_id_offset_hex
FROM sys.objects
WHERE parent_object_id = OBJECT_ID('T1')
ORDER BY name;
drop table T1
Example Results 2008
+--------------------------+-----------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+-----------+---------------+----------------------+
| CK__T1__1D498357 | 491357015 | 0x1D498357 | 0x1C555F1E |
| CK__T1__A__1A6D16AC | 443356844 | 0x1A6D16AC | 0x1978F273 |
| DF__T1__B__1B613AE5 | 459356901 | 0x1B613AE5 | 0x1A6D16AC |
| FK__T1__B__1C555F1E | 475356958 | 0x1C555F1E | 0x1B613AE5 |
| PK__T1__3BD019AE15A8618F | 379356616 | 0x169C85C8 | 0x15A8618F |
| UQ__T1__3BD019A91884CE3A | 427356787 | 0x1978F273 | 0x1884CE3A |
+--------------------------+-----------+---------------+----------------------+
Example Results 2017
+--------------------------+------------+---------------+----------------------+
| name | object_id | object_id_hex | object_id_offset_hex |
+--------------------------+------------+---------------+----------------------+
| CK__T1__59FA5E80 | 1509580416 | 0x59FA5E80 | 0x59063A47 |
| CK__T1__A__571DF1D5 | 1461580245 | 0x571DF1D5 | 0x5629CD9C |
| DF__T1__B__5812160E | 1477580302 | 0x5812160E | 0x571DF1D5 |
| FK__T1__B__59063A47 | 1493580359 | 0x59063A47 | 0x5812160E |
| PK__T1__3BD019AE0A4A6932 | 1429580131 | 0x5535A963 | 0x5441852A |
| UQ__T1__3BD019A981F522E0 | 1445580188 | 0x5629CD9C | 0x5535A963 |
+--------------------------+------------+---------------+----------------------+
For default constraints, check constraints and foreign key constraints the last 4 bytes of the auto generated name are a hexadecimal version of the objectid of the constraint. As objectid
are guaranteed unique the name must also be unique. In Sybase too these use tabname_colname_objectid
For unique constraints and primary key constraints Sybase uses
tabname_colname_tabindid, where tabindid is a string concatenation of
the table ID and index ID
This too would guarantee uniqueness.
SQL Server doesn't use this scheme.
In both SQL Server 2008 and 2017 it uses an 8 byte string at the end of the system generated name however the algorithm has changed as to how the last 4 bytes of that are generated.
In 2008 the last 4 bytes represent a signed integer counter that is offset from the object_id
by -16000057
with any negative value wrapping around to max signed int. (The significance of 16000057
is that this is the increment applied between successively created object_id
)
On 2012 upwards I don't see any pattern at all between the object_id of the constraint and the integer obtained by treating the last 8 characters of the name as the hexadecimal representation of a signed int.
The function names in the call stack in 2017 shows that it now creates a GUID as part of the name generation process (On 2008 I see no mention of MDConstraintNameGenerator
). I guess this is to provide some source of randomness. Clearly it isn't using the whole 16 bytes from the GUID in that 4 bytes that changes between constraints however.
I presume the new algorithm was done for some efficiency reason at the expense of some increased possibility of collisions in extreme cases such as yours.
edited 32 mins ago
answered 7 hours ago
Martin SmithMartin Smith
64.9k10175261
64.9k10175261
add a comment |
add a comment |
Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision
Remember this is the "birthday problem". You're not trying to generate a collision for a single given hash, but rather measuring the probability that none of the many pairs of values will collide.
So with N tables, there are N*(N-1)/2 pairs, so here about 10^16 pairs. If the probability of a collision is 2^-64, the probability of a single pair not colliding is (1-2^-64), but with so many pairs, the probability of having no collisions here is about (1-2^-64)^(10^16), or more like 1/10,000. See eg https://preshing.com/20110504/hash-collision-probabilities/
And if it's only a 32bit hash the probablity of a collision crosses 1/2 at only 77k values.
add a comment |
Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision
Remember this is the "birthday problem". You're not trying to generate a collision for a single given hash, but rather measuring the probability that none of the many pairs of values will collide.
So with N tables, there are N*(N-1)/2 pairs, so here about 10^16 pairs. If the probability of a collision is 2^-64, the probability of a single pair not colliding is (1-2^-64), but with so many pairs, the probability of having no collisions here is about (1-2^-64)^(10^16), or more like 1/10,000. See eg https://preshing.com/20110504/hash-collision-probabilities/
And if it's only a 32bit hash the probablity of a collision crosses 1/2 at only 77k values.
add a comment |
Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision
Remember this is the "birthday problem". You're not trying to generate a collision for a single given hash, but rather measuring the probability that none of the many pairs of values will collide.
So with N tables, there are N*(N-1)/2 pairs, so here about 10^16 pairs. If the probability of a collision is 2^-64, the probability of a single pair not colliding is (1-2^-64), but with so many pairs, the probability of having no collisions here is about (1-2^-64)^(10^16), or more like 1/10,000. See eg https://preshing.com/20110504/hash-collision-probabilities/
And if it's only a 32bit hash the probablity of a collision crosses 1/2 at only 77k values.
Assuming I have 100 million tables, I calculate less than a 1-in-1-trillion chance of a collision
Remember this is the "birthday problem". You're not trying to generate a collision for a single given hash, but rather measuring the probability that none of the many pairs of values will collide.
So with N tables, there are N*(N-1)/2 pairs, so here about 10^16 pairs. If the probability of a collision is 2^-64, the probability of a single pair not colliding is (1-2^-64), but with so many pairs, the probability of having no collisions here is about (1-2^-64)^(10^16), or more like 1/10,000. See eg https://preshing.com/20110504/hash-collision-probabilities/
And if it's only a 32bit hash the probablity of a collision crosses 1/2 at only 77k values.
answered 7 hours ago
David Browne - MicrosoftDavid Browne - Microsoft
12.9k834
12.9k834
add a comment |
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f236768%2fcan-sql-server-create-collisions-in-system-generated-constraint-names%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Loving the question! But wouldn't the tablenameprefix be different for each table?
– Randi Vertongen
12 hours ago
The table names are different but they use a naming convention which results in at least the first 11 characters being the same, and that seems to be all SQL Server uses in generating the constraint name.
– jl6
12 hours ago
Is the underlying hardware the same (number of cores and speed)? It could be that greater throughput increases the likelihood rather than a change in the algorithm.
– Dan Guzman
11 hours ago
The underlying hardware is different (newer generation of DL380) but not significantly higher performance. The aim of the exercise is to replace the out of support SQL Server 2008, not to improve throughput, and the hardware has been provisioned accordingly.
– jl6
11 hours ago
@Martin Smith - I’ve done some testing which seems to suggest that the first 8 bytes of the hex in the constraint name depend only on the name of the primary key column. The last 8 bytes look random.
– jl6
10 hours ago