Fastest way to pop N items from a large dict The Next CEO of Stack OverflowWhat is the fastest...

MessageLevel in QGIS3

Novel about a guy who is possessed by the divine essence and the world ends?

Is micro rebar a better way to reinforce concrete than rebar?

Elegant way to replace substring in a regex with optional groups in Python?

How do I go from 300 unfinished/half written blog posts, to published posts?

Preparing Indesign booklet with .psd graphics for print

Written every which way

Why do airplanes bank sharply to the right after air-to-air refueling?

Which tube will fit a -(700 x 25c) wheel?

Can you replace a racial trait cantrip when leveling up?

What can we do to stop prior company from asking us questions?

Would this house-rule that treats advantage as a +1 to the roll instead (and disadvantage as -1) and allows them to stack be balanced?

Is there a difference between "Fahrstuhl" and "Aufzug"

Inappropriate reference requests from Journal reviewers

Unreliable Magic - Is it worth it?

Does it take more energy to get to Venus or to Mars?

Are there any limitations on attacking while grappling?

Sending manuscript to multiple publishers

Why am I allowed to create multiple unique pointers from a single object?

What happened in Rome, when the western empire "fell"?

Why didn't Khan get resurrected in the Genesis Explosion?

Is it possible to search for a directory/file combination?

What connection does MS Office have to Netscape Navigator?

Do I need to enable Dev Hub in my PROD Org?

Fastest way to pop N items from a large dict

The Next CEO of Stack OverflowWhat is the fastest way to get the value of π?Fastest way to convert string to integer in PHPFastest way to determine if an integer's square root is an integerHow to randomly select an item from a list?How to remove items from a list while iterating?Fastest way to list all primes below NHow to remove the first Item from a list?Fastest way to check if a value exist in a listIs there any pythonic way to combine two dicts (adding values for keys that appear in both)?Fastest way to determine if an integer is between two integers (inclusive) with known sets of values

I have a large dict src (up to 1M items) and I would like to take N (typical values would be N=10K-20K) items, store them in a new dict dst and leave only the remaining items in src. It doesn't matter which N items are taken. I'm looking for the fastest way to do it on Python 3.6 or 3.7.

Fastest approach I've found so far:

src = {i: i ** 3 for i in range(1000000)}



# Taking items 1 by 1 (~0.0059s)

dst = {}

while len(dst) < 20000:

    item = src.popitem()

    dst[item[0]] = item[1]

Is there anything better? Even a marginal gain would be good.

edited Mar 16 at 17:16

martineau

69.8k1092186

asked Mar 16 at 17:00

Ivailo Karamanolev

570615

add a comment |

Fastest approach I've found so far:

src = {i: i ** 3 for i in range(1000000)}



# Taking items 1 by 1 (~0.0059s)

dst = {}

while len(dst) < 20000:

    item = src.popitem()

    dst[item[0]] = item[1]

Is there anything better? Even a marginal gain would be good.

edited Mar 16 at 17:16

martineau

69.8k1092186

asked Mar 16 at 17:00

Ivailo Karamanolev

570615

add a comment |

Fastest approach I've found so far:

src = {i: i ** 3 for i in range(1000000)}



# Taking items 1 by 1 (~0.0059s)

dst = {}

while len(dst) < 20000:

    item = src.popitem()

    dst[item[0]] = item[1]

Is there anything better? Even a marginal gain would be good.

edited Mar 16 at 17:16

martineau

69.8k1092186

asked Mar 16 at 17:00

Ivailo Karamanolev

570615

Fastest approach I've found so far:

src = {i: i ** 3 for i in range(1000000)}



# Taking items 1 by 1 (~0.0059s)

dst = {}

while len(dst) < 20000:

    item = src.popitem()

    dst[item[0]] = item[1]

Is there anything better? Even a marginal gain would be good.

python python-3.x performance dictionary optimization

edited Mar 16 at 17:16

martineau

69.8k1092186

asked Mar 16 at 17:00

Ivailo Karamanolev

570615

edited Mar 16 at 17:16

martineau

69.8k1092186

asked Mar 16 at 17:00

Ivailo Karamanolev

570615

edited Mar 16 at 17:16

martineau

69.8k1092186

edited Mar 16 at 17:16

martineau

69.8k1092186

edited Mar 16 at 17:16

martineau

69.8k1092186

asked Mar 16 at 17:00

Ivailo Karamanolev

570615

asked Mar 16 at 17:00

Ivailo Karamanolev

570615

asked Mar 16 at 17:00

Ivailo Karamanolev

570615

add a comment |

3 Answers
3

active

oldest

votes

This is a bit faster still:

from itertools import islice

def method_4(d):

    result = dict(islice(d.items(), 20000))

    for k in result: del d[k]

    return result

Compared to other versions, using Netwave's testcase:

Method 1:  0.004459443036466837  # original

Method 2:  0.0034434819826856256 # Netwave

Method 3:  0.002602717955596745  # chepner

Method 4:  0.001974945073015988  # this answer

The extra speedup seems to come from avoiding transitions between C and Python functions. From disassembly we can note that the dict instantiation happens on C side, with only 3 function calls from Python. The loop uses DELETE_SUBSCR opcode instead of needing a function call:

>>> dis.dis(method_4)

  2           0 LOAD_GLOBAL              0 (dict)

              2 LOAD_GLOBAL              1 (islice)

              4 LOAD_FAST                0 (d)

              6 LOAD_ATTR                2 (items)

              8 CALL_FUNCTION            0

             10 LOAD_CONST               1 (20000)

             12 CALL_FUNCTION            2

             14 CALL_FUNCTION            1

             16 STORE_FAST               1 (result)



  3          18 SETUP_LOOP              18 (to 38)

             20 LOAD_FAST                1 (result)

             22 GET_ITER

        >>   24 FOR_ITER                10 (to 36)

             26 STORE_FAST               2 (k)

             28 LOAD_FAST                0 (d)

             30 LOAD_FAST                2 (k)

             32 DELETE_SUBSCR

             34 JUMP_ABSOLUTE           24

        >>   36 POP_BLOCK



  4     >>   38 LOAD_FAST                1 (result)

             40 RETURN_VALUE

Compared with the iterator in method_2:

>>> dis.dis(d.popitem() for _ in range(20000))

  1           0 LOAD_FAST                0 (.0)

        >>    2 FOR_ITER                14 (to 18)

              4 STORE_FAST               1 (_)

              6 LOAD_GLOBAL              0 (d)

              8 LOAD_ATTR                1 (popitem)

             10 CALL_FUNCTION            0

             12 YIELD_VALUE

             14 POP_TOP

             16 JUMP_ABSOLUTE            2

        >>   18 LOAD_CONST               0 (None)

             20 RETURN_VALUE

which needs a Python to C function call for each item.

edited Mar 17 at 11:55

answered Mar 16 at 20:51

jpa

5,4681226

I was researching this! what a sync!

– Netwave
Mar 17 at 11:58

@Netwave Do you think this should now be the accepted answer?

– Ivailo Karamanolev
Mar 17 at 18:43

@IvailoKaramanolev, yes, we were searching for the fastest, and indeed this on is.

– Netwave
Mar 18 at 3:13

add a comment |

A simple comprehension inside dict will do:

dict(src.popitem() for _ in range(20000))

Here you have the timing tests

setup = """

src = {i: i ** 3 for i in range(1000000)}



def method_1(d):

  dst = {}

  while len(dst) < 20000:

      item = d.popitem()

      dst[item[0]] = item[1]

  return dst



def method_2(d):

  return dict(d.popitem() for _ in range(20000))

"""

import timeit

print("Method 1: ", timeit.timeit('method_1(src)', setup=setup, number=1))



print("Method 2: ", timeit.timeit('method_2(src)', setup=setup, number=1))

Results:

Method 1:  0.007701821999944514

Method 2:  0.004668198998842854

edited Mar 17 at 11:57

answered Mar 16 at 17:16

Netwave

13.5k22246

3

so much for dict comprehension. Good one using dict!!!

– Jean-François Fabre♦
Mar 16 at 17:18

2

Thank you! Hopefully next moderator @Jean-FrançoisFabre ;)

– Netwave
Mar 16 at 17:19

1

for my solution, at least, I've upped all 10 times and it's still faster. The key is to make the shorter code possible and rely on native functions.

– Jean-François Fabre♦
Mar 16 at 17:23

4

You can shave a little more time off by saving the bound method first. f = d.popitem; return dict(f() for _ in range(20000)).

– chepner
Mar 16 at 17:37

3

Using itertools.islice and itertools.repeat is even a little faster still: dict(f() for f in islice(repeat(d.popitem), 20000)).

– chepner
Mar 16 at 17:49

|
show 5 more comments

I found this approach slightly faster (-10% speed) using dictionary comprehension that consumes a loop using range that yields & unpacks the keys & values

dst = {key:value for key,value in (src.popitem() for _ in range(20000))}

on my machine:

your code: 0.00899505615234375

my code:   0.007996797561645508

so about 12% faster, not bad but not as good as not unpacking like Netwave simpler answer

This approach can be useful if you want to transform the keys or values in the process.

edited Mar 16 at 17:22

answered Mar 16 at 17:15

Jean-François Fabre♦

106k1057115

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55199303%2ffastest-way-to-pop-n-items-from-a-large-dict%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

This is a bit faster still:

from itertools import islice

def method_4(d):

    result = dict(islice(d.items(), 20000))

    for k in result: del d[k]

    return result

Compared to other versions, using Netwave's testcase:

Method 1:  0.004459443036466837  # original

Method 2:  0.0034434819826856256 # Netwave

Method 3:  0.002602717955596745  # chepner

Method 4:  0.001974945073015988  # this answer

>>> dis.dis(method_4)

  2           0 LOAD_GLOBAL              0 (dict)

              2 LOAD_GLOBAL              1 (islice)

              4 LOAD_FAST                0 (d)

              6 LOAD_ATTR                2 (items)

              8 CALL_FUNCTION            0

             10 LOAD_CONST               1 (20000)

             12 CALL_FUNCTION            2

             14 CALL_FUNCTION            1

             16 STORE_FAST               1 (result)



  3          18 SETUP_LOOP              18 (to 38)

             20 LOAD_FAST                1 (result)

             22 GET_ITER

        >>   24 FOR_ITER                10 (to 36)

             26 STORE_FAST               2 (k)

             28 LOAD_FAST                0 (d)

             30 LOAD_FAST                2 (k)

             32 DELETE_SUBSCR

             34 JUMP_ABSOLUTE           24

        >>   36 POP_BLOCK



  4     >>   38 LOAD_FAST                1 (result)

             40 RETURN_VALUE

Compared with the iterator in method_2:

>>> dis.dis(d.popitem() for _ in range(20000))

  1           0 LOAD_FAST                0 (.0)

        >>    2 FOR_ITER                14 (to 18)

              4 STORE_FAST               1 (_)

              6 LOAD_GLOBAL              0 (d)

              8 LOAD_ATTR                1 (popitem)

             10 CALL_FUNCTION            0

             12 YIELD_VALUE

             14 POP_TOP

             16 JUMP_ABSOLUTE            2

        >>   18 LOAD_CONST               0 (None)

             20 RETURN_VALUE

which needs a Python to C function call for each item.

edited Mar 17 at 11:55

answered Mar 16 at 20:51

jpa

5,4681226

I was researching this! what a sync!

– Netwave
Mar 17 at 11:58

@Netwave Do you think this should now be the accepted answer?

– Ivailo Karamanolev
Mar 17 at 18:43

@IvailoKaramanolev, yes, we were searching for the fastest, and indeed this on is.

– Netwave
Mar 18 at 3:13

add a comment |

This is a bit faster still:

from itertools import islice

def method_4(d):

    result = dict(islice(d.items(), 20000))

    for k in result: del d[k]

    return result

Compared to other versions, using Netwave's testcase:

Method 1:  0.004459443036466837  # original

Method 2:  0.0034434819826856256 # Netwave

Method 3:  0.002602717955596745  # chepner

Method 4:  0.001974945073015988  # this answer

>>> dis.dis(method_4)

  2           0 LOAD_GLOBAL              0 (dict)

              2 LOAD_GLOBAL              1 (islice)

              4 LOAD_FAST                0 (d)

              6 LOAD_ATTR                2 (items)

              8 CALL_FUNCTION            0

             10 LOAD_CONST               1 (20000)

             12 CALL_FUNCTION            2

             14 CALL_FUNCTION            1

             16 STORE_FAST               1 (result)



  3          18 SETUP_LOOP              18 (to 38)

             20 LOAD_FAST                1 (result)

             22 GET_ITER

        >>   24 FOR_ITER                10 (to 36)

             26 STORE_FAST               2 (k)

             28 LOAD_FAST                0 (d)

             30 LOAD_FAST                2 (k)

             32 DELETE_SUBSCR

             34 JUMP_ABSOLUTE           24

        >>   36 POP_BLOCK



  4     >>   38 LOAD_FAST                1 (result)

             40 RETURN_VALUE

Compared with the iterator in method_2:

>>> dis.dis(d.popitem() for _ in range(20000))

  1           0 LOAD_FAST                0 (.0)

        >>    2 FOR_ITER                14 (to 18)

              4 STORE_FAST               1 (_)

              6 LOAD_GLOBAL              0 (d)

              8 LOAD_ATTR                1 (popitem)

             10 CALL_FUNCTION            0

             12 YIELD_VALUE

             14 POP_TOP

             16 JUMP_ABSOLUTE            2

        >>   18 LOAD_CONST               0 (None)

             20 RETURN_VALUE

which needs a Python to C function call for each item.

edited Mar 17 at 11:55

answered Mar 16 at 20:51

jpa

5,4681226

I was researching this! what a sync!

– Netwave
Mar 17 at 11:58

@Netwave Do you think this should now be the accepted answer?

– Ivailo Karamanolev
Mar 17 at 18:43

@IvailoKaramanolev, yes, we were searching for the fastest, and indeed this on is.

– Netwave
Mar 18 at 3:13

add a comment |

This is a bit faster still:

from itertools import islice

def method_4(d):

    result = dict(islice(d.items(), 20000))

    for k in result: del d[k]

    return result

Compared to other versions, using Netwave's testcase:

Method 1:  0.004459443036466837  # original

Method 2:  0.0034434819826856256 # Netwave

Method 3:  0.002602717955596745  # chepner

Method 4:  0.001974945073015988  # this answer

>>> dis.dis(method_4)

  2           0 LOAD_GLOBAL              0 (dict)

              2 LOAD_GLOBAL              1 (islice)

              4 LOAD_FAST                0 (d)

              6 LOAD_ATTR                2 (items)

              8 CALL_FUNCTION            0

             10 LOAD_CONST               1 (20000)

             12 CALL_FUNCTION            2

             14 CALL_FUNCTION            1

             16 STORE_FAST               1 (result)



  3          18 SETUP_LOOP              18 (to 38)

             20 LOAD_FAST                1 (result)

             22 GET_ITER

        >>   24 FOR_ITER                10 (to 36)

             26 STORE_FAST               2 (k)

             28 LOAD_FAST                0 (d)

             30 LOAD_FAST                2 (k)

             32 DELETE_SUBSCR

             34 JUMP_ABSOLUTE           24

        >>   36 POP_BLOCK



  4     >>   38 LOAD_FAST                1 (result)

             40 RETURN_VALUE

Compared with the iterator in method_2:

>>> dis.dis(d.popitem() for _ in range(20000))

  1           0 LOAD_FAST                0 (.0)

        >>    2 FOR_ITER                14 (to 18)

              4 STORE_FAST               1 (_)

              6 LOAD_GLOBAL              0 (d)

              8 LOAD_ATTR                1 (popitem)

             10 CALL_FUNCTION            0

             12 YIELD_VALUE

             14 POP_TOP

             16 JUMP_ABSOLUTE            2

        >>   18 LOAD_CONST               0 (None)

             20 RETURN_VALUE

which needs a Python to C function call for each item.

edited Mar 17 at 11:55

answered Mar 16 at 20:51

jpa

5,4681226

This is a bit faster still:

from itertools import islice

def method_4(d):

    result = dict(islice(d.items(), 20000))

    for k in result: del d[k]

    return result

Compared to other versions, using Netwave's testcase:

Method 1:  0.004459443036466837  # original

Method 2:  0.0034434819826856256 # Netwave

Method 3:  0.002602717955596745  # chepner

Method 4:  0.001974945073015988  # this answer

>>> dis.dis(method_4)

  2           0 LOAD_GLOBAL              0 (dict)

              2 LOAD_GLOBAL              1 (islice)

              4 LOAD_FAST                0 (d)

              6 LOAD_ATTR                2 (items)

              8 CALL_FUNCTION            0

             10 LOAD_CONST               1 (20000)

             12 CALL_FUNCTION            2

             14 CALL_FUNCTION            1

             16 STORE_FAST               1 (result)



  3          18 SETUP_LOOP              18 (to 38)

             20 LOAD_FAST                1 (result)

             22 GET_ITER

        >>   24 FOR_ITER                10 (to 36)

             26 STORE_FAST               2 (k)

             28 LOAD_FAST                0 (d)

             30 LOAD_FAST                2 (k)

             32 DELETE_SUBSCR

             34 JUMP_ABSOLUTE           24

        >>   36 POP_BLOCK



  4     >>   38 LOAD_FAST                1 (result)

             40 RETURN_VALUE

Compared with the iterator in method_2:

>>> dis.dis(d.popitem() for _ in range(20000))

  1           0 LOAD_FAST                0 (.0)

        >>    2 FOR_ITER                14 (to 18)

              4 STORE_FAST               1 (_)

              6 LOAD_GLOBAL              0 (d)

              8 LOAD_ATTR                1 (popitem)

             10 CALL_FUNCTION            0

             12 YIELD_VALUE

             14 POP_TOP

             16 JUMP_ABSOLUTE            2

        >>   18 LOAD_CONST               0 (None)

             20 RETURN_VALUE

which needs a Python to C function call for each item.

edited Mar 17 at 11:55

answered Mar 16 at 20:51

jpa

5,4681226

edited Mar 17 at 11:55

answered Mar 16 at 20:51

jpa

5,4681226

answered Mar 16 at 20:51

jpa

5,4681226

answered Mar 16 at 20:51

jpa

5,4681226

I was researching this! what a sync!

– Netwave
Mar 17 at 11:58

@Netwave Do you think this should now be the accepted answer?

– Ivailo Karamanolev
Mar 17 at 18:43

@IvailoKaramanolev, yes, we were searching for the fastest, and indeed this on is.

– Netwave
Mar 18 at 3:13

add a comment |

I was researching this! what a sync!

– Netwave
Mar 17 at 11:58

@Netwave Do you think this should now be the accepted answer?

– Ivailo Karamanolev
Mar 17 at 18:43

@IvailoKaramanolev, yes, we were searching for the fastest, and indeed this on is.

– Netwave
Mar 18 at 3:13

I was researching this! what a sync!

– Netwave
Mar 17 at 11:58

@Netwave Do you think this should now be the accepted answer?

– Ivailo Karamanolev
Mar 17 at 18:43

@IvailoKaramanolev, yes, we were searching for the fastest, and indeed this on is.

– Netwave
Mar 18 at 3:13

add a comment |

A simple comprehension inside dict will do:

dict(src.popitem() for _ in range(20000))

Here you have the timing tests

setup = """

src = {i: i ** 3 for i in range(1000000)}



def method_1(d):

  dst = {}

  while len(dst) < 20000:

      item = d.popitem()

      dst[item[0]] = item[1]

  return dst



def method_2(d):

  return dict(d.popitem() for _ in range(20000))

"""

import timeit

print("Method 1: ", timeit.timeit('method_1(src)', setup=setup, number=1))



print("Method 2: ", timeit.timeit('method_2(src)', setup=setup, number=1))

Results:

Method 1:  0.007701821999944514

Method 2:  0.004668198998842854

edited Mar 17 at 11:57

answered Mar 16 at 17:16

Netwave

13.5k22246

3

so much for dict comprehension. Good one using dict!!!

– Jean-François Fabre♦
Mar 16 at 17:18

2

Thank you! Hopefully next moderator @Jean-FrançoisFabre ;)

– Netwave
Mar 16 at 17:19

1

for my solution, at least, I've upped all 10 times and it's still faster. The key is to make the shorter code possible and rely on native functions.

– Jean-François Fabre♦
Mar 16 at 17:23

4

You can shave a little more time off by saving the bound method first. f = d.popitem; return dict(f() for _ in range(20000)).

– chepner
Mar 16 at 17:37

3

Using itertools.islice and itertools.repeat is even a little faster still: dict(f() for f in islice(repeat(d.popitem), 20000)).

– chepner
Mar 16 at 17:49

|
show 5 more comments

A simple comprehension inside dict will do:

dict(src.popitem() for _ in range(20000))

Here you have the timing tests

setup = """

src = {i: i ** 3 for i in range(1000000)}



def method_1(d):

  dst = {}

  while len(dst) < 20000:

      item = d.popitem()

      dst[item[0]] = item[1]

  return dst



def method_2(d):

  return dict(d.popitem() for _ in range(20000))

"""

import timeit

print("Method 1: ", timeit.timeit('method_1(src)', setup=setup, number=1))



print("Method 2: ", timeit.timeit('method_2(src)', setup=setup, number=1))

Results:

Method 1:  0.007701821999944514

Method 2:  0.004668198998842854

edited Mar 17 at 11:57

answered Mar 16 at 17:16

Netwave

13.5k22246

3

so much for dict comprehension. Good one using dict!!!

– Jean-François Fabre♦
Mar 16 at 17:18

2

Thank you! Hopefully next moderator @Jean-FrançoisFabre ;)

– Netwave
Mar 16 at 17:19

1

for my solution, at least, I've upped all 10 times and it's still faster. The key is to make the shorter code possible and rely on native functions.

– Jean-François Fabre♦
Mar 16 at 17:23

4

You can shave a little more time off by saving the bound method first. f = d.popitem; return dict(f() for _ in range(20000)).

– chepner
Mar 16 at 17:37

3

Using itertools.islice and itertools.repeat is even a little faster still: dict(f() for f in islice(repeat(d.popitem), 20000)).

– chepner
Mar 16 at 17:49

|
show 5 more comments

A simple comprehension inside dict will do:

dict(src.popitem() for _ in range(20000))

Here you have the timing tests

setup = """

src = {i: i ** 3 for i in range(1000000)}



def method_1(d):

  dst = {}

  while len(dst) < 20000:

      item = d.popitem()

      dst[item[0]] = item[1]

  return dst



def method_2(d):

  return dict(d.popitem() for _ in range(20000))

"""

import timeit

print("Method 1: ", timeit.timeit('method_1(src)', setup=setup, number=1))



print("Method 2: ", timeit.timeit('method_2(src)', setup=setup, number=1))

Results:

Method 1:  0.007701821999944514

Method 2:  0.004668198998842854

edited Mar 17 at 11:57

answered Mar 16 at 17:16

Netwave

13.5k22246

A simple comprehension inside dict will do:

dict(src.popitem() for _ in range(20000))

Here you have the timing tests

setup = """

src = {i: i ** 3 for i in range(1000000)}



def method_1(d):

  dst = {}

  while len(dst) < 20000:

      item = d.popitem()

      dst[item[0]] = item[1]

  return dst



def method_2(d):

  return dict(d.popitem() for _ in range(20000))

"""

import timeit

print("Method 1: ", timeit.timeit('method_1(src)', setup=setup, number=1))



print("Method 2: ", timeit.timeit('method_2(src)', setup=setup, number=1))

Results:

Method 1:  0.007701821999944514

Method 2:  0.004668198998842854

edited Mar 17 at 11:57

answered Mar 16 at 17:16

Netwave

13.5k22246

edited Mar 17 at 11:57

answered Mar 16 at 17:16

Netwave

13.5k22246

answered Mar 16 at 17:16

Netwave

13.5k22246

answered Mar 16 at 17:16

Netwave

13.5k22246

3

so much for dict comprehension. Good one using dict!!!

– Jean-François Fabre♦
Mar 16 at 17:18

2

Thank you! Hopefully next moderator @Jean-FrançoisFabre ;)

– Netwave
Mar 16 at 17:19

1

for my solution, at least, I've upped all 10 times and it's still faster. The key is to make the shorter code possible and rely on native functions.

– Jean-François Fabre♦
Mar 16 at 17:23

4

You can shave a little more time off by saving the bound method first. f = d.popitem; return dict(f() for _ in range(20000)).

– chepner
Mar 16 at 17:37

3

Using itertools.islice and itertools.repeat is even a little faster still: dict(f() for f in islice(repeat(d.popitem), 20000)).

– chepner
Mar 16 at 17:49

|
show 5 more comments

3

so much for dict comprehension. Good one using dict!!!

– Jean-François Fabre♦
Mar 16 at 17:18

2

Thank you! Hopefully next moderator @Jean-FrançoisFabre ;)

– Netwave
Mar 16 at 17:19

1

for my solution, at least, I've upped all 10 times and it's still faster. The key is to make the shorter code possible and rely on native functions.

– Jean-François Fabre♦
Mar 16 at 17:23

4

You can shave a little more time off by saving the bound method first. f = d.popitem; return dict(f() for _ in range(20000)).

– chepner
Mar 16 at 17:37

3

Using itertools.islice and itertools.repeat is even a little faster still: dict(f() for f in islice(repeat(d.popitem), 20000)).

– chepner
Mar 16 at 17:49

so much for dict comprehension. Good one using dict!!!

– Jean-François Fabre♦
Mar 16 at 17:18

Thank you! Hopefully next moderator @Jean-FrançoisFabre ;)

– Netwave
Mar 16 at 17:19

for my solution, at least, I've upped all 10 times and it's still faster. The key is to make the shorter code possible and rely on native functions.

– Jean-François Fabre♦
Mar 16 at 17:23

You can shave a little more time off by saving the bound method first. f = d.popitem; return dict(f() for _ in range(20000)).

– chepner
Mar 16 at 17:37

Using itertools.islice and itertools.repeat is even a little faster still: dict(f() for f in islice(repeat(d.popitem), 20000)).

– chepner
Mar 16 at 17:49

|
show 5 more comments

I found this approach slightly faster (-10% speed) using dictionary comprehension that consumes a loop using range that yields & unpacks the keys & values

dst = {key:value for key,value in (src.popitem() for _ in range(20000))}

on my machine:

your code: 0.00899505615234375

my code:   0.007996797561645508

so about 12% faster, not bad but not as good as not unpacking like Netwave simpler answer

This approach can be useful if you want to transform the keys or values in the process.

edited Mar 16 at 17:22

answered Mar 16 at 17:15

Jean-François Fabre♦

106k1057115

add a comment |

I found this approach slightly faster (-10% speed) using dictionary comprehension that consumes a loop using range that yields & unpacks the keys & values

dst = {key:value for key,value in (src.popitem() for _ in range(20000))}

on my machine:

your code: 0.00899505615234375

my code:   0.007996797561645508

so about 12% faster, not bad but not as good as not unpacking like Netwave simpler answer

This approach can be useful if you want to transform the keys or values in the process.

edited Mar 16 at 17:22

answered Mar 16 at 17:15

Jean-François Fabre♦

106k1057115

add a comment |

I found this approach slightly faster (-10% speed) using dictionary comprehension that consumes a loop using range that yields & unpacks the keys & values

dst = {key:value for key,value in (src.popitem() for _ in range(20000))}

on my machine:

your code: 0.00899505615234375

my code:   0.007996797561645508

so about 12% faster, not bad but not as good as not unpacking like Netwave simpler answer

This approach can be useful if you want to transform the keys or values in the process.

edited Mar 16 at 17:22

answered Mar 16 at 17:15

Jean-François Fabre♦

106k1057115

I found this approach slightly faster (-10% speed) using dictionary comprehension that consumes a loop using range that yields & unpacks the keys & values

dst = {key:value for key,value in (src.popitem() for _ in range(20000))}

on my machine:

your code: 0.00899505615234375

my code:   0.007996797561645508

so about 12% faster, not bad but not as good as not unpacking like Netwave simpler answer

This approach can be useful if you want to transform the keys or values in the process.

edited Mar 16 at 17:22

answered Mar 16 at 17:15

Jean-François Fabre♦

106k1057115

edited Mar 16 at 17:22

answered Mar 16 at 17:15

Jean-François Fabre♦

106k1057115

answered Mar 16 at 17:15

Jean-François Fabre♦

106k1057115

answered Mar 16 at 17:15

Jean-François Fabre♦

106k1057115

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Dtjytyk