{"input": "What does the \"yield\" keyword do? What is the use of the yield keyword in Python? What does it do?\nFor example, I'm trying to understand this code1:\ndef _get_child_candidates(self, distance, min_dist, max_dist):\n if self._leftchild and distance - max_dist < self._median:\n yield self._leftchild\n if self._rightchild and distance + max_dist >= self._median:\n yield self._rightchild \n\nAnd this is the caller:\nresult, candidates = list(), [self]\nwhile candidates:\n node = candidates.pop()\n distance = node._get_dist(obj)\n if distance <= max_dist and distance >= min_dist:\n result.extend(node._values)\n candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))\nreturn result\n\nWhat happens when the method _get_child_candidates is called?\nA list is returned? A single element is returned? Is it called again? When will subsequent calls stop?\n\n\n1. The code comes from Jochen Schulz (jrschulz), who made a great Python library for metric spaces. This is the link to the complete source: Module mspace.\n", "output": "To understand what yield does, you must understand what generators are. And before generators come iterables.\nIterables\nWhen you create a list, you can read its items one by one. Reading its items one by one is called iteration:\n>>> mylist = [1, 2, 3]\n>>> for i in mylist:\n... print(i)\n1\n2\n3\n\nmylist is an iterable. When you use a list comprehension, you create a list, and so an iterable:\n>>> mylist = [x*x for x in range(3)]\n>>> for i in mylist:\n... print(i)\n0\n1\n4\n\nEverything you can use \"for... in...\" on is an iterable; lists, strings, files...\nThese iterables are handy because you can read them as much as you wish, but you store all the values in memory and this is not always what you want when you have a lot of values.\nGenerators\nGenerators are iterators, but you can only iterate over them once. 
It's because they do not store all the values in memory, they generate the values on the fly:\n>>> mygenerator = (x*x for x in range(3))\n>>> for i in mygenerator:\n... print(i)\n0\n1\n4\n\nIt is just the same except you used () instead of []. BUT, you cannot perform for i in mygenerator a second time since generators can only be used once: they calculate 0, then forget about it and calculate 1, and finally calculate 4, one by one.\nYield\nYield is a keyword that is used like return, except the function will return a generator.\n>>> def createGenerator():\n... mylist = range(3)\n... for i in mylist:\n... yield i*i\n...\n>>> mygenerator = createGenerator() # create a generator\n>>> print(mygenerator) # mygenerator is an object!\n<generator object createGenerator at 0x...>\n>>> for i in mygenerator:\n... print(i)\n0\n1\n4\n\nHere it's a useless example, but it's handy when you know your function will return a huge set of values that you will only need to read once.\nTo master yield, you must understand that when you call the function, the code you have written in the function body does not run. The function only returns the generator object; this is a bit tricky :-)\nThen, your code will be run each time the for uses the generator.\nNow the hard part:\nThe first time the for calls the generator object created from your function, it will run the code in your function from the beginning until it hits yield, then it'll return the first value of the loop. Then, each subsequent call will run the loop you have written in the function one more time, and return the next value, until there is no value to return.\nThe generator is considered empty once the function runs but does not hit yield anymore. 
It can be because the loop has come to an end, or because you do not satisfy an \"if/else\" anymore.\n\nYour code explained\nGenerator:\n# Here you create the method of the node object that will return the generator\ndef _get_child_candidates(self, distance, min_dist, max_dist):\n\n # Here is the code that will be called each time you use the generator object:\n\n # If there is still a child of the node object on its left\n # AND if distance is ok, return the next child\n if self._leftchild and distance - max_dist < self._median:\n yield self._leftchild\n\n # If there is still a child of the node object on its right\n # AND if distance is ok, return the next child\n if self._rightchild and distance + max_dist >= self._median:\n yield self._rightchild\n\n # If the function arrives here, the generator will be considered empty\n # there are no more than two values: the left and the right children\n\nCaller:\n# Create an empty list and a list with the current object reference\nresult, candidates = list(), [self]\n\n# Loop on candidates (they contain only one element at the beginning)\nwhile candidates:\n\n # Get the last candidate and remove it from the list\n node = candidates.pop()\n\n # Get the distance between obj and the candidate\n distance = node._get_dist(obj)\n\n # If distance is ok, then you can fill the result\n if distance <= max_dist and distance >= min_dist:\n result.extend(node._values)\n\n # Add the children of the candidate in the candidates list\n # so the loop will keep running until it has looked\n # at all the children of the children of the children, etc. of the candidate\n candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))\n\nreturn result\n\nThis code contains several smart parts:\n\nThe loop iterates on a list but the list expands while the loop is being iterated :-) It's a concise way to go through all this nested data even if it's a bit dangerous since you can end up with an infinite loop. 
In this case, candidates.extend(node._get_child_candidates(distance, min_dist, max_dist)) exhausts all the values of the generator, but the while loop keeps creating new generator objects which will produce different values from the previous ones since it's not applied on the same node.\nThe extend() method is a list object method that expects an iterable and adds its values to the list.\n\nUsually we pass a list to it:\n>>> a = [1, 2]\n>>> b = [3, 4]\n>>> a.extend(b)\n>>> print(a)\n[1, 2, 3, 4]\n\nBut in your code it gets a generator, which is good because:\n\nYou don't need to read the values twice.\nYou may have a lot of children and you don't want them all stored in memory.\n\nAnd it works because Python does not care if the argument of a method is a list or not. Python expects iterables so it will work with strings, lists, tuples and generators! This is called duck typing and is one of the reasons why Python is so cool. But this is another story, for another question...\nYou can stop here, or read a little bit to see an advanced use of a generator:\nControlling a generator exhaustion\n>>> class Bank(): # let's create a bank, building ATMs\n... crisis = False\n... def create_atm(self):\n... while not self.crisis:\n... yield \"$100\"\n>>> hsbc = Bank() # when everything's ok the ATM gives you as much as you want\n>>> corner_street_atm = hsbc.create_atm()\n>>> print(next(corner_street_atm))\n$100\n>>> print(next(corner_street_atm))\n$100\n>>> print([next(corner_street_atm) for cash in range(5)])\n['$100', '$100', '$100', '$100', '$100']\n>>> hsbc.crisis = True # crisis is coming, no more money!\n>>> print(next(corner_street_atm))\nTraceback (most recent call last):\n ...\nStopIteration\n>>> wall_street_atm = hsbc.create_atm() # it's even true for new ATMs\n>>> print(next(wall_street_atm))\nTraceback (most recent call last):\n ...\nStopIteration\n>>> hsbc.crisis = False # trouble is, even post-crisis the ATM remains empty\n>>> print(next(corner_street_atm))\nTraceback (most recent call last):\n ...\nStopIteration\n>>> brand_new_atm = hsbc.create_atm() # build a new one to get back in business\n>>> for cash in brand_new_atm:\n... 
print(cash)\n$100\n$100\n$100\n$100\n$100\n$100\n$100\n$100\n$100\n...\n\nIt can be useful for various things like controlling access to a resource.\nItertools, your best friend\nThe itertools module contains special functions to manipulate iterables. Ever wish to duplicate a generator?\nChain two generators? Group values in a nested list with a one-liner? Map / Zip without creating another list?\nThen just import itertools.\nAn example? Let's see the possible orders of arrival for a 4-horse race:\n>>> import itertools\n>>> horses = [1, 2, 3, 4]\n>>> races = itertools.permutations(horses)\n>>> print(races)\n<itertools.permutations object at 0x...>\n>>> print(list(itertools.permutations(horses)))\n[(1, 2, 3, 4),\n (1, 2, 4, 3),\n (1, 3, 2, 4),\n (1, 3, 4, 2),\n (1, 4, 2, 3),\n (1, 4, 3, 2),\n (2, 1, 3, 4),\n (2, 1, 4, 3),\n (2, 3, 1, 4),\n (2, 3, 4, 1),\n (2, 4, 1, 3),\n (2, 4, 3, 1),\n (3, 1, 2, 4),\n (3, 1, 4, 2),\n (3, 2, 1, 4),\n (3, 2, 4, 1),\n (3, 4, 1, 2),\n (3, 4, 2, 1),\n (4, 1, 2, 3),\n (4, 1, 3, 2),\n (4, 2, 1, 3),\n (4, 2, 3, 1),\n (4, 3, 1, 2),\n (4, 3, 2, 1)]\n\nUnderstanding the inner mechanisms of iteration\nIteration is a process involving iterables (implementing the __iter__() method) and iterators (implementing the __next__() method).\nIterables are any objects you can get an iterator from. Iterators are objects that let you iterate on iterables.\nMore about it in this article about how the for loop works.\n"} {"input": "What is a metaclass in Python? What are metaclasses? What do you use them for?\n", "output": "Classes as objects\nBefore understanding metaclasses, you need to master classes in Python. And Python has a very peculiar idea of what classes are, borrowed from the Smalltalk language.\nIn most languages, classes are just pieces of code that describe how to produce an object. That's kinda true in Python too:\n>>> class ObjectCreator(object):\n... pass\n... 
\n\n>>> my_object = ObjectCreator()\n>>> print(my_object)\n<__main__.ObjectCreator object at 0x8974f2c>\n\nBut classes are more than that in Python. Classes are objects too.\nYes, objects. \nAs soon as you use the keyword class, Python executes it and creates\nan OBJECT. The instruction\n>>> class ObjectCreator(object):\n... pass\n... \n\ncreates in memory an object with the name \"ObjectCreator\". \nThis object (the class) is itself capable of creating objects (the instances), \nand this is why it's a class. \nBut still, it's an object, and therefore:\n\nyou can assign it to a variable\nyou can copy it\nyou can add attributes to it\nyou can pass it as a function parameter\n\ne.g.:\n>>> print(ObjectCreator) # you can print a class because it's an object\n<class '__main__.ObjectCreator'>\n>>> def echo(o):\n... print(o)\n... \n>>> echo(ObjectCreator) # you can pass a class as a parameter\n<class '__main__.ObjectCreator'>\n>>> print(hasattr(ObjectCreator, 'new_attribute'))\nFalse\n>>> ObjectCreator.new_attribute = 'foo' # you can add attributes to a class\n>>> print(hasattr(ObjectCreator, 'new_attribute'))\nTrue\n>>> print(ObjectCreator.new_attribute)\nfoo\n>>> ObjectCreatorMirror = ObjectCreator # you can assign a class to a variable\n>>> print(ObjectCreatorMirror.new_attribute)\nfoo\n>>> print(ObjectCreatorMirror())\n<__main__.ObjectCreator object at 0x8997b4c>\n\nCreating classes dynamically\nSince classes are objects, you can create them on the fly, like any object.\nFirst, you can create a class in a function using class:\n>>> def choose_class(name):\n... if name == 'foo':\n... class Foo(object):\n... pass\n... return Foo # return the class, not an instance\n... else:\n... class Bar(object):\n... pass\n... return Bar\n... 
\n>>> MyClass = choose_class('foo') \n>>> print(MyClass) # the function returns a class, not an instance\n<class '__main__.Foo'>\n>>> print(MyClass()) # you can create an object from this class\n<__main__.Foo object at 0x89c6d4c>\n\nBut it's not so dynamic, since you still have to write the whole class yourself.\nSince classes are objects, they must be generated by something.\nWhen you use the class keyword, Python creates this object automatically. But as\nwith most things in Python, it gives you a way to do it manually.\nRemember the function type? The good old function that lets you know what \ntype an object is:\n>>> print(type(1))\n<type 'int'>\n>>> print(type(\"1\"))\n<type 'str'>\n>>> print(type(ObjectCreator))\n<type 'type'>\n>>> print(type(ObjectCreator()))\n<class '__main__.ObjectCreator'>\n\nWell, type has a completely different ability: it can also create classes on the fly. type can take the description of a class as parameters, \nand return a class.\n(I know, it's silly that the same function can have two completely different uses according to the parameters you pass to it. It's an issue due to backwards \ncompatibility in Python)\ntype works this way:\ntype(name of the class, \n tuple of the parent classes (for inheritance, can be empty), \n dictionary containing attribute names and values)\n\ne.g.:\n>>> class MyShinyClass(object):\n... pass\n\ncan be created manually this way:\n>>> MyShinyClass = type('MyShinyClass', (), {}) # returns a class object\n>>> print(MyShinyClass)\n<class '__main__.MyShinyClass'>\n>>> print(MyShinyClass()) # create an instance with the class\n<__main__.MyShinyClass object at 0x8997cec>\n\nYou'll notice that we use \"MyShinyClass\" as the name of the class\nand as the variable to hold the class reference. They can be different,\nbut there is no reason to complicate things.\ntype accepts a dictionary to define the attributes of the class. So:\n>>> class Foo(object):\n... 
bar = True\n\nCan be translated to:\n>>> Foo = type('Foo', (), {'bar':True})\n\nAnd used as a normal class:\n>>> print(Foo)\n<class '__main__.Foo'>\n>>> print(Foo.bar)\nTrue\n>>> f = Foo()\n>>> print(f)\n<__main__.Foo object at 0x8a9b84c>\n>>> print(f.bar)\nTrue\n\nAnd of course, you can inherit from it, so:\n>>> class FooChild(Foo):\n... pass\n\nwould be:\n>>> FooChild = type('FooChild', (Foo,), {})\n>>> print(FooChild)\n<class '__main__.FooChild'>\n>>> print(FooChild.bar) # bar is inherited from Foo\nTrue\n\nEventually you'll want to add methods to your class. Just define a function\nwith the proper signature and assign it as an attribute.\n>>> def echo_bar(self):\n... print(self.bar)\n... \n>>> FooChild = type('FooChild', (Foo,), {'echo_bar': echo_bar})\n>>> hasattr(Foo, 'echo_bar')\nFalse\n>>> hasattr(FooChild, 'echo_bar')\nTrue\n>>> my_foo = FooChild()\n>>> my_foo.echo_bar()\nTrue\n\nAnd you can add even more methods after you dynamically create the class, just like adding methods to a normally created class object.\n>>> def echo_bar_more(self):\n... print('yet another method')\n... \n>>> FooChild.echo_bar_more = echo_bar_more\n>>> hasattr(FooChild, 'echo_bar_more')\nTrue\n\nYou see where we are going: in Python, classes are objects, and you can create a class on the fly, dynamically.\nThis is what Python does when you use the keyword class, and it does so by using a metaclass.\nWhat are metaclasses (finally)\nMetaclasses are the 'stuff' that creates classes.\nYou define classes in order to create objects, right?\nBut we learned that Python classes are objects.\nWell, metaclasses are what create these objects. They are the classes' classes,\nyou can picture them this way:\nMyClass = MetaClass()\nMyObject = MyClass()\n\nYou've seen that type lets you do something like this:\nMyClass = type('MyClass', (), {})\n\nIt's because the function type is in fact a metaclass. 
type is the \nmetaclass Python uses to create all classes behind the scenes.\nNow you wonder why the heck it is written in lowercase, and not Type?\nWell, I guess it's a matter of consistency with str, the class that creates\nstring objects, and int, the class that creates integer objects. type is\njust the class that creates class objects.\nYou see that by checking the __class__ attribute. \nEverything, and I mean everything, is an object in Python. That includes ints, \nstrings, functions and classes. All of them are objects. And all of them have\nbeen created from a class:\n>>> age = 35\n>>> age.__class__\n<type 'int'>\n>>> name = 'bob'\n>>> name.__class__\n<type 'str'>\n>>> def foo(): pass\n>>> foo.__class__\n<type 'function'>\n>>> class Bar(object): pass\n>>> b = Bar()\n>>> b.__class__\n<class '__main__.Bar'>\n\nNow, what is the __class__ of any __class__ ?\n>>> age.__class__.__class__\n<type 'type'>\n>>> name.__class__.__class__\n<type 'type'>\n>>> foo.__class__.__class__\n<type 'type'>\n>>> b.__class__.__class__\n<type 'type'>\n\nSo, a metaclass is just the stuff that creates class objects.\nYou can call it a 'class factory' if you wish.\ntype is the built-in metaclass Python uses, but of course, you can create your\nown metaclass.\nThe __metaclass__ attribute\nYou can add a __metaclass__ attribute when you write a class:\nclass Foo(object):\n __metaclass__ = something...\n [...]\n\nIf you do so, Python will use the metaclass to create the class Foo.\nCareful, it's tricky.\nYou write class Foo(object) first, but the class object Foo is not created\nin memory yet.\nPython will look for __metaclass__ in the class definition. If it finds it,\nit will use it to create the object class Foo. 
If it doesn't, it will use\ntype to create the class.\nRead that several times.\nWhen you do:\nclass Foo(Bar):\n pass\n\nPython does the following:\nIs there a __metaclass__ attribute in Foo?\nIf yes, create in memory a class object (I said a class object, stay with me here), with the name Foo by using what is in __metaclass__.\nIf Python can't find __metaclass__, it will look for a __metaclass__ at the MODULE level, and try to do the same (but only for classes that don't inherit anything, basically old-style classes). \nThen if it can't find any __metaclass__ at all, it will use the Bar's (the first parent) own metaclass (which might be the default type) to create the class object.\nBe careful here that the __metaclass__ attribute will not be inherited, the metaclass of the parent (Bar.__class__) will be. If Bar used a __metaclass__ attribute that created Bar with type() (and not type.__new__()), the subclasses will not inherit that behavior.\nNow the big question is, what can you put in __metaclass__ ?\nThe answer is: something that can create a class.\nAnd what can create a class? type, or anything that subclasses or uses it.\nCustom metaclasses\nThe main purpose of a metaclass is to change the class automatically,\nwhen it's created.\nYou usually do this for APIs, where you want to create classes matching the\ncurrent context.\nImagine a stupid example, where you decide that all classes in your module\nshould have their attributes written in uppercase. There are several ways to \ndo this, but one way is to set __metaclass__ at the module level.\nThis way, all classes of this module will be created using this metaclass, \nand we just have to tell the metaclass to turn all attributes to uppercase.\nLuckily, __metaclass__ can actually be any callable, it doesn't need to be a\nformal class (I know, something with 'class' in its name doesn't need to be \na class, go figure... 
but it's helpful).\nSo we will start with a simple example, by using a function.\n# the metaclass will automatically get passed the same argument\n# that you usually pass to `type`\ndef upper_attr(future_class_name, future_class_parents, future_class_attr):\n \"\"\"\n Return a class object, with the list of its attribute turned \n into uppercase.\n \"\"\"\n\n # pick up any attribute that doesn't start with '__' and uppercase it\n uppercase_attr = {}\n for name, val in future_class_attr.items():\n if not name.startswith('__'):\n uppercase_attr[name.upper()] = val\n else:\n uppercase_attr[name] = val\n\n # let `type` do the class creation\n return type(future_class_name, future_class_parents, uppercase_attr)\n\n__metaclass__ = upper_attr # this will affect all classes in the module\n\nclass Foo(): # global __metaclass__ won't work with \"object\" though\n # but we can define __metaclass__ here instead to affect only this class\n # and this will work with \"object\" children\n bar = 'bip'\n\nprint(hasattr(Foo, 'bar'))\n# Out: False\nprint(hasattr(Foo, 'BAR'))\n# Out: True\n\nf = Foo()\nprint(f.BAR)\n# Out: 'bip'\n\nNow, let's do exactly the same, but using a real class for a metaclass:\n# remember that `type` is actually a class like `str` and `int`\n# so you can inherit from it\nclass UpperAttrMetaclass(type): \n # __new__ is the method called before __init__\n # it's the method that creates the object and returns it\n # while __init__ just initializes the object passed as parameter\n # you rarely use __new__, except when you want to control how the object\n # is created.\n # here the created object is the class, and we want to customize it\n # so we override __new__\n # you can do some stuff in __init__ too if you wish\n # some advanced use involves overriding __call__ as well, but we won't\n # see this\n def __new__(upperattr_metaclass, future_class_name, \n future_class_parents, future_class_attr):\n\n uppercase_attr = {}\n for name, val in 
future_class_attr.items():\n if not name.startswith('__'):\n uppercase_attr[name.upper()] = val\n else:\n uppercase_attr[name] = val\n\n return type(future_class_name, future_class_parents, uppercase_attr)\n\nBut this is not really OOP. We call type directly and we don't override\nor call the parent __new__. Let's do it:\nclass UpperAttrMetaclass(type): \n\n def __new__(upperattr_metaclass, future_class_name, \n future_class_parents, future_class_attr):\n\n uppercase_attr = {}\n for name, val in future_class_attr.items():\n if not name.startswith('__'):\n uppercase_attr[name.upper()] = val\n else:\n uppercase_attr[name] = val\n\n # reuse the type.__new__ method\n # this is basic OOP, nothing magic in there\n return type.__new__(upperattr_metaclass, future_class_name, \n future_class_parents, uppercase_attr)\n\nYou may have noticed the extra argument upperattr_metaclass. There is\nnothing special about it: __new__ always receives the class it's defined in, as first parameter. Just like you have self for ordinary methods which receive the instance as first parameter, or the defining class for class methods.\nOf course, the names I used here are long for the sake of clarity, but like\nfor self, all the arguments have conventional names. 
So a real production\nmetaclass would look like this:\nclass UpperAttrMetaclass(type): \n\n def __new__(cls, clsname, bases, dct):\n\n uppercase_attr = {}\n for name, val in dct.items():\n if not name.startswith('__'):\n uppercase_attr[name.upper()] = val\n else:\n uppercase_attr[name] = val\n\n return type.__new__(cls, clsname, bases, uppercase_attr)\n\nWe can make it even cleaner by using super, which will ease inheritance (because yes, you can have metaclasses inheriting from metaclasses, inheriting from type):\nclass UpperAttrMetaclass(type): \n\n def __new__(cls, clsname, bases, dct):\n\n uppercase_attr = {}\n for name, val in dct.items():\n if not name.startswith('__'):\n uppercase_attr[name.upper()] = val\n else:\n uppercase_attr[name] = val\n\n return super(UpperAttrMetaclass, cls).__new__(cls, clsname, bases, uppercase_attr)\n\nThat's it. There is really nothing more about metaclasses.\nThe reason behind the complexity of the code using metaclasses is not because\nof metaclasses, it's because you usually use metaclasses to do twisted stuff\nrelying on introspection, manipulating inheritance, vars such as __dict__, etc.\nIndeed, metaclasses are especially useful to do black magic, and therefore\ncomplicated stuff. But by themselves, they are simple:\n\nintercept a class creation\nmodify the class\nreturn the modified class\n\nWhy would you use metaclass classes instead of functions?\nSince __metaclass__ can accept any callable, why would you use a class\nsince it's obviously more complicated?\nThere are several reasons to do so:\n\nThe intention is clear. When you read UpperAttrMetaclass(type), you know\nwhat's going to follow.\nYou can use OOP. Metaclasses can inherit from metaclasses and override parent methods. Metaclasses can even use metaclasses.\nYou can structure your code better. You never use metaclasses for something as\ntrivial as the above example. It's usually for something complicated. 
Having the\nability to make several methods and group them in one class is very useful\nto make the code easier to read.\nYou can hook on __new__, __init__ and __call__. Which will allow\nyou to do different stuff. Even if usually you can do it all in __new__, \nsome people are just more comfortable using __init__.\nThese are called metaclasses, damn it! It must mean something!\n\nWhy would you use metaclasses?\nNow the big question. Why would you use some obscure error prone feature?\nWell, usually you don't:\n\nMetaclasses are deeper magic that\n 99% of users should never worry about.\n If you wonder whether you need them,\n you don't (the people who actually\n need them know with certainty that\n they need them, and don't need an\n explanation about why). \n\nPython Guru Tim Peters\nThe main use case for a metaclass is creating an API. A typical example of this is the Django ORM.\nIt allows you to define something like this:\nclass Person(models.Model):\n name = models.CharField(max_length=30)\n age = models.IntegerField()\n\nBut if you do this:\nguy = Person(name='bob', age='35')\nprint(guy.age)\n\nIt won't return an IntegerField object. It will return an int, and can even take it directly from the database.\nThis is possible because models.Model defines __metaclass__ and \nit uses some magic that will turn the Person you just defined with simple statements\ninto a complex hook to a database field. \nDjango makes something complex look simple by exposing a simple API\nand using metaclasses, recreating code from this API to do the real job\nbehind the scenes.\nThe last word\nFirst, you know that classes are objects that can create instances.\nWell in fact, classes are themselves instances. Of metaclasses.\n>>> class Foo(object): pass\n>>> id(Foo)\n142630324\n\nEverything is an object in Python, and they are all either instances of classes\nor instances of metaclasses.\nExcept for type.\ntype is actually its own metaclass. 
This is not something you could\nreproduce in pure Python, and is done by cheating a little bit at the implementation\nlevel.\nSecondly, metaclasses are complicated. You may not want to use them for \nvery simple class alterations. You can change classes by using two different techniques:\n\nmonkey patching\nclass decorators\n\n99% of the time you need class alteration, you are better off using these.\nBut 99% of the time, you don't need class alteration at all.\n"} {"input": "How do I check whether a file exists using Python? How do I check whether a file exists, without using the try statement?\n", "output": "You can also use os.path.isfile\n\nReturn True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.\n\nimport os.path\nos.path.isfile(fname) \n\nif you need to be sure it's a file.\nStarting with Python 3.4, the pathlib module offers an object-oriented approach:\nfrom pathlib import Path\n\nmy_file = Path(\"/path/to/file\")\nif my_file.is_file():\n # file exists\n\n"} {"input": "Does Python have a ternary conditional operator? If Python does not have a ternary conditional operator, is it possible to simulate one using other language constructs?\n", "output": "Yes, it was added in version 2.5.\nThe syntax is:\na if condition else b\n\nFirst condition is evaluated, then either a or b is returned based on the Boolean value of condition\nIf condition evaluates to True a is returned, else b is returned. 
\nFor example:\n>>> 'true' if True else 'false'\n'true'\n>>> 'true' if False else 'false'\n'false'\n\nKeep in mind that it's frowned upon by some Pythonistas for several reasons:\n\nThe order of the arguments is different from many other languages (such as C, Ruby, Java, etc.), which may lead to bugs when people unfamiliar with Python's \"surprising\" behaviour use it (they may reverse the order).\nSome find it \"unwieldy\", since it goes contrary to the normal flow of thought (thinking of the condition first and then the effects).\nStylistic reasons.\n\nIf you're having trouble remembering the order, then remember that if you read it out loud, you (almost) say what you mean. For example, x = 4 if b > 8 else 9 is read aloud as x will be 4 if b is greater than 8 otherwise 9.\nOfficial documentation:\n\nConditional expressions\nIs there an equivalent of C's \"?:\" ternary operator?\n\n"} {"input": "Calling an external command in Python How can I call an external command (as if I'd typed it at the Unix shell or Windows command prompt) from within a Python script?\n", "output": "Look at the subprocess module in the stdlib:\nfrom subprocess import call\ncall([\"ls\", \"-l\"])\n\nThe advantage of subprocess vs system is that it is more flexible (you can get the stdout, stderr, the \"real\" status code, better error handling, etc...). os.system is not deprecated, but the documentation recommends replacing it with subprocess:\nhttps://docs.python.org/2/library/subprocess.html#replacing-older-functions-with-the-subprocess-module\nFor quick/dirty/one-time scripts, os.system is enough, though.\n"} {"input": "What does `if __name__ == \"__main__\":` do? 
What does the if __name__ == \"__main__\": do?\n# Threading example\nimport time, thread\n\ndef myfunction(string, sleeptime, lock, *args):\n while 1:\n lock.acquire()\n time.sleep(sleeptime)\n lock.release()\n time.sleep(sleeptime)\nif __name__ == \"__main__\":\n lock = thread.allocate_lock()\n thread.start_new_thread(myfunction, (\"Thread #: 1\", 2, lock))\n thread.start_new_thread(myfunction, (\"Thread #: 2\", 2, lock))\n\n", "output": "When the Python interpreter reads a source file, it executes all of the code found in it. \nBefore executing the code, it will define a few special variables. For example, if the python interpreter is running that module (the source file) as the main program, it sets the special __name__ variable to have a value \"__main__\". If this file is being imported from another module, __name__ will be set to the module's name.\nIn the case of your script, let's assume that it's executing as the main function, e.g. you said something like\npython threading_example.py\n\non the command line. After setting up the special variables, it will execute the import statement and load those modules. It will then evaluate the def block, creating a function object and creating a variable called myfunction that points to the function object. It will then read the if statement and see that __name__ does equal \"__main__\", so it will execute the block shown there.\nOne of the reasons for doing this is that sometimes you write a module (a .py file) where it can be executed directly. Alternatively, it can also be imported and used in another module. By doing the main check, you can have that code only execute when you want to run the module as a program and not have it execute when someone just wants to import your module and call your functions themselves.\nSee this page for some extra details.\n"} {"input": "How to make a chain of function decorators in Python? 
How can I make two decorators in Python that would do the following?\n@makebold\n@makeitalic\ndef say():\n return \"Hello\"\n\n...which should return:\n\"<b><i>Hello</i></b>\"\n\nI'm not trying to make HTML this way in a real application - just trying to understand how decorators and decorator chaining works.\n", "output": "If you are not into long explanations, see Paolo Bergantino's answer.\nDecorator Basics\nPython's functions are objects\nTo understand decorators, you must first understand that functions are objects in Python. This has important consequences. Let's see why with a simple example:\ndef shout(word=\"yes\"):\n return word.capitalize()+\"!\"\n\nprint(shout())\n# outputs : 'Yes!'\n\n# As an object, you can assign the function to a variable like any other object \nscream = shout\n\n# Notice we don't use parentheses: we are not calling the function, we are putting the function \"shout\" into the variable \"scream\". It means you can then call \"shout\" from \"scream\":\n\nprint(scream())\n# outputs : 'Yes!'\n\n# More than that, it means you can remove the old name 'shout', and the function will still be accessible from 'scream'\n\ndel shout\ntry:\n print(shout())\nexcept NameError as e:\n print(e)\n #outputs: \"name 'shout' is not defined\"\n\nprint(scream())\n# outputs: 'Yes!'\n\nKeep this in mind. We'll circle back to it shortly. \nAnother interesting property of Python functions is they can be defined inside another function!\ndef talk():\n\n # You can define a function on the fly in \"talk\" ...\n def whisper(word=\"yes\"):\n return word.lower()+\"...\"\n\n # ... and use it right away!\n print(whisper())\n\n# You call \"talk\", that defines \"whisper\" EVERY TIME you call it, then\n# \"whisper\" is called in \"talk\". 
\ntalk()\n# outputs: \n# \"yes...\"\n\n# But \"whisper\" DOES NOT EXIST outside \"talk\":\n\ntry:\n print(whisper())\nexcept NameError as e:\n print(e)\n #outputs : \"name 'whisper' is not defined\"\n\nFunction references\nOkay, still here? Now the fun part...\nYou've seen that functions are objects. Therefore, functions:\n\ncan be assigned to a variable\ncan be defined in another function\n\nThat means that a function can return another function.\ndef getTalk(kind=\"shout\"):\n\n # We define functions on the fly\n def shout(word=\"yes\"):\n return word.capitalize()+\"!\"\n\n def whisper(word=\"yes\"):\n return word.lower()+\"...\"\n\n # Then we return one of them\n if kind == \"shout\":\n # We don't use \"()\", we are not calling the function, we are returning the function object\n return shout \n else:\n return whisper\n\n# How do you use this strange beast?\n\n# Get the function and assign it to a variable\ntalk = getTalk() \n\n# You can see that \"talk\" is here a function object:\nprint(talk)\n#outputs : <function shout at 0x...>\n\n# The object is the one returned by the function:\nprint(talk())\n#outputs : Yes!\n\n# And you can even use it directly if you feel wild:\nprint(getTalk(\"whisper\")())\n#outputs : yes...\n\nThere's more! \nIf you can return a function, you can pass one as a parameter:\ndef doSomethingBefore(func): \n print(\"I do something before then I call the function you gave me\")\n print(func())\n\ndoSomethingBefore(scream)\n#outputs: \n#I do something before then I call the function you gave me\n#Yes!\n\nWell, you just have everything needed to understand decorators. 
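As a compact recap of those properties (functions can be assigned, nested, returned, and passed around), here is a small sketch; the compose helper is made up for illustration and is exactly the pattern a decorator builds on:

```python
def compose(f, g):
    # Takes two functions and returns a brand-new one: a function
    # built out of other functions, defined on the fly.
    def composed(x):
        return f(g(x))
    return composed

# Pass two functions in, get a new function back.
loud = compose(str.upper, str.strip)
print(loud("  hello  "))
# outputs: HELLO
```

Nothing here is decorator-specific: once you are comfortable with functions producing and consuming other functions, decorators are just one particular arrangement of this idea.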
You see, decorators are \"wrappers\", which means that they let you execute code before and after the function they decorate without modifying the function itself.\nHandcrafted decorators\nHow you'd do it manually:\n# A decorator is a function that expects ANOTHER function as parameter\ndef my_shiny_new_decorator(a_function_to_decorate):\n\n # Inside, the decorator defines a function on the fly: the wrapper.\n # This function is going to be wrapped around the original function\n # so it can execute code before and after it.\n def the_wrapper_around_the_original_function():\n\n # Put here the code you want to be executed BEFORE the original function is called\n print(\"Before the function runs\")\n\n # Call the function here (using parentheses)\n a_function_to_decorate()\n\n # Put here the code you want to be executed AFTER the original function is called\n print(\"After the function runs\")\n\n # At this point, \"a_function_to_decorate\" HAS NEVER BEEN EXECUTED.\n # We return the wrapper function we have just created.\n # The wrapper contains the function and the code to execute before and after. 
It's ready to use!\n return the_wrapper_around_the_original_function\n\n# Now imagine you create a function you don't want to ever touch again.\ndef a_stand_alone_function():\n print(\"I am a stand alone function, don't you dare modify me\")\n\na_stand_alone_function() \n#outputs: I am a stand alone function, don't you dare modify me\n\n# Well, you can decorate it to extend its behavior.\n# Just pass it to the decorator, it will wrap it dynamically in \n# any code you want and return you a new function ready to be used:\n\na_stand_alone_function_decorated = my_shiny_new_decorator(a_stand_alone_function)\na_stand_alone_function_decorated()\n#outputs:\n#Before the function runs\n#I am a stand alone function, don't you dare modify me\n#After the function runs\n\nNow, you probably want that every time you call a_stand_alone_function, a_stand_alone_function_decorated is called instead. That's easy, just overwrite a_stand_alone_function with the function returned by my_shiny_new_decorator:\na_stand_alone_function = my_shiny_new_decorator(a_stand_alone_function)\na_stand_alone_function()\n#outputs:\n#Before the function runs\n#I am a stand alone function, don't you dare modify me\n#After the function runs\n\n# That's EXACTLY what decorators do!\n\nDecorators demystified\nThe previous example, using the decorator syntax:\n@my_shiny_new_decorator\ndef another_stand_alone_function():\n print(\"Leave me alone\")\n\nanother_stand_alone_function() \n#outputs: \n#Before the function runs\n#Leave me alone\n#After the function runs\n\nYes, that's all, it's that simple. @decorator is just a shortcut to:\nanother_stand_alone_function = my_shiny_new_decorator(another_stand_alone_function)\n\nDecorators are just a pythonic variant of the decorator design pattern. 
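To convince yourself that the @ syntax and the manual reassignment really are the same thing, you can apply one decorator both ways and compare; the exclaim decorator below is invented for illustration:

```python
def exclaim(func):
    # Wrap func so its result gets an exclamation mark appended.
    def wrapper():
        return func() + "!"
    return wrapper

@exclaim
def hi():
    return "hi"

def hello():
    return "hello"
hello = exclaim(hello)  # the manual spelling of @exclaim

print(hi())     # outputs: hi!
print(hello())  # outputs: hello!
```

In both cases the original name now points at the wrapper (its __name__ is "wrapper"), which is why the functools.wraps fix discussed later in this answer exists.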
There are several classic design patterns embedded in Python to ease development (like iterators).\nOf course, you can accumulate decorators:\ndef bread(func):\n def wrapper():\n print(\"</''''''\\>\")\n func()\n print(\"<\\______/>\")\n return wrapper\n\ndef ingredients(func):\n def wrapper():\n print(\"#tomatoes#\")\n func()\n print(\"~salad~\")\n return wrapper\n\ndef sandwich(food=\"--ham--\"):\n print(food)\n\nsandwich()\n#outputs: --ham--\nsandwich = bread(ingredients(sandwich))\nsandwich()\n#outputs:\n#</''''''\\>\n# #tomatoes#\n# --ham--\n# ~salad~\n#<\\______/>\n\nUsing the Python decorator syntax:\n@bread\n@ingredients\ndef sandwich(food=\"--ham--\"):\n print(food)\n\nsandwich()\n#outputs:\n#</''''''\\>\n# #tomatoes#\n# --ham--\n# ~salad~\n#<\\______/>\n\nThe order you set the decorators MATTERS:\n@ingredients\n@bread\ndef strange_sandwich(food=\"--ham--\"):\n print(food)\n\nstrange_sandwich()\n#outputs:\n##tomatoes#\n#</''''''\\>\n# --ham--\n#<\\______/>\n# ~salad~\n\n\nNow: to answer the question...\nAs a conclusion, you can easily see how to answer the question:\n# The decorator to make it bold\ndef makebold(fn):\n # The new function the decorator returns\n def wrapper():\n # Insertion of some code before and after\n return \"<b>\" + fn() + \"</b>\"\n return wrapper\n\n# The decorator to make it italic\ndef makeitalic(fn):\n # The new function the decorator returns\n def wrapper():\n # Insertion of some code before and after\n return \"<i>\" + fn() + \"</i>\"\n return wrapper\n\n@makebold\n@makeitalic\ndef say():\n return \"hello\"\n\nprint(say())\n#outputs: <b><i>hello</i></b>\n\n# This is the exact equivalent to \ndef say():\n return \"hello\"\nsay = makebold(makeitalic(say))\n\nprint(say())\n#outputs: <b><i>hello</i></b>\n\nYou can now just leave happy, or burn your brain a little bit more and see advanced uses of decorators.\n\nTaking decorators to the next level\nPassing arguments to the decorated function\n# It's not black magic, you just have to let the wrapper \n# pass the argument:\n\ndef 
a_decorator_passing_arguments(function_to_decorate):\n def a_wrapper_accepting_arguments(arg1, arg2):\n print(\"I got args! Look: {0}, {1}\".format(arg1, arg2))\n function_to_decorate(arg1, arg2)\n return a_wrapper_accepting_arguments\n\n# Since when you are calling the function returned by the decorator, you are\n# calling the wrapper, passing arguments to the wrapper will let it pass them to \n# the decorated function\n\n@a_decorator_passing_arguments\ndef print_full_name(first_name, last_name):\n print(\"My name is {0} {1}\".format(first_name, last_name))\n\nprint_full_name(\"Peter\", \"Venkman\")\n# outputs:\n#I got args! Look: Peter Venkman\n#My name is Peter Venkman\n\nDecorating methods\nOne nifty thing about Python is that methods and functions are really the same. The only difference is that methods expect that their first argument is a reference to the current object (self). \nThat means you can build a decorator for methods the same way! Just remember to take self into consideration:\ndef method_friendly_decorator(method_to_decorate):\n def wrapper(self, lie):\n lie = lie - 3 # very friendly, decrease age even more :-)\n return method_to_decorate(self, lie)\n return wrapper\n\n\nclass Lucy(object):\n\n def __init__(self):\n self.age = 32\n\n @method_friendly_decorator\n def sayYourAge(self, lie):\n print(\"I am {0}, what did you think?\".format(self.age + lie))\n\nl = Lucy()\nl.sayYourAge(-3)\n#outputs: I am 26, what did you think?\n\nIf you're making a general-purpose decorator--one you'll apply to any function or method, no matter its arguments--then just use *args, **kwargs:\ndef a_decorator_passing_arbitrary_arguments(function_to_decorate):\n # The wrapper accepts any arguments\n def a_wrapper_accepting_arbitrary_arguments(*args, **kwargs):\n print(\"Do I have args?:\")\n print(args)\n print(kwargs)\n # Then you unpack the arguments, here *args, **kwargs\n # If you are not familiar with unpacking, check:\n # 
http://www.saltycrane.com/blog/2008/01/how-to-use-args-and-kwargs-in-python/\n function_to_decorate(*args, **kwargs)\n return a_wrapper_accepting_arbitrary_arguments\n\n@a_decorator_passing_arbitrary_arguments\ndef function_with_no_argument():\n print(\"Python is cool, no argument here.\")\n\nfunction_with_no_argument()\n#outputs\n#Do I have args?:\n#()\n#{}\n#Python is cool, no argument here.\n\n@a_decorator_passing_arbitrary_arguments\ndef function_with_arguments(a, b, c):\n print(a, b, c)\n\nfunction_with_arguments(1,2,3)\n#outputs\n#Do I have args?:\n#(1, 2, 3)\n#{}\n#1 2 3 \n\n@a_decorator_passing_arbitrary_arguments\ndef function_with_named_arguments(a, b, c, platypus=\"Why not ?\"):\n print(\"Do {0}, {1} and {2} like platypus? {3}\".format(a, b, c, platypus))\n\nfunction_with_named_arguments(\"Bill\", \"Linus\", \"Steve\", platypus=\"Indeed!\")\n#outputs\n#Do I have args ? :\n#('Bill', 'Linus', 'Steve')\n#{'platypus': 'Indeed!'}\n#Do Bill, Linus and Steve like platypus? Indeed!\n\nclass Mary(object):\n\n def __init__(self):\n self.age = 31\n\n @a_decorator_passing_arbitrary_arguments\n def sayYourAge(self, lie=-3): # You can now add a default value\n print(\"I am {0}, what did you think?\".format(self.age + lie))\n\nm = Mary()\nm.sayYourAge()\n#outputs\n# Do I have args?:\n#(<__main__.Mary object at 0xb7d303ac>,)\n#{}\n#I am 28, what did you think?\n\nPassing arguments to the decorator\nGreat, now what would you say about passing arguments to the decorator itself? \nThis can get somewhat twisted, since a decorator must accept a function as an argument. 
Therefore, you cannot pass the decorated function's arguments directly to the decorator.\nBefore rushing to the solution, let's write a little reminder: \n# Decorators are ORDINARY functions\ndef my_decorator(func):\n print(\"I am an ordinary function\")\n def wrapper():\n print(\"I am the function returned by the decorator\")\n func()\n return wrapper\n\n# Therefore, you can call it without any \"@\"\n\ndef lazy_function():\n print(\"zzzzzzzz\")\n\ndecorated_function = my_decorator(lazy_function)\n#outputs: I am an ordinary function\n\n# It outputs \"I am an ordinary function\", because that's just what you do:\n# calling a function. Nothing magic.\n\n@my_decorator\ndef lazy_function():\n print(\"zzzzzzzz\")\n\n#outputs: I am an ordinary function\n\nIt's exactly the same. \"my_decorator\" is called. So when you @my_decorator, you are telling Python to call the function 'labelled by the variable \"my_decorator\"'. \nThis is important! The label you give can point directly to the decorator, or not. \nLet's get evil. ☺\ndef decorator_maker():\n\n print(\"I make decorators! I am executed only once: \"\n \"when you make me create a decorator.\")\n\n def my_decorator(func):\n\n print(\"I am a decorator! I am executed only when you decorate a function.\")\n\n def wrapped():\n print(\"I am the wrapper around the decorated function. \"\n \"I am called when you call the decorated function. \"\n \"As the wrapper, I return the RESULT of the decorated function.\")\n return func()\n\n print(\"As the decorator, I return the wrapped function.\")\n\n return wrapped\n\n print(\"As a decorator maker, I return a decorator\")\n return my_decorator\n\n# Let's create a decorator. It's just a new function after all.\nnew_decorator = decorator_maker() \n#outputs:\n#I make decorators! 
I am executed only once: when you make me create a decorator.\n#As a decorator maker, I return a decorator\n\n# Then we decorate the function\n\ndef decorated_function():\n print(\"I am the decorated function.\")\n\ndecorated_function = new_decorator(decorated_function)\n#outputs:\n#I am a decorator! I am executed only when you decorate a function.\n#As the decorator, I return the wrapped function\n\n# Let's call the function:\ndecorated_function()\n#outputs:\n#I am the wrapper around the decorated function. I am called when you call the decorated function.\n#As the wrapper, I return the RESULT of the decorated function.\n#I am the decorated function.\n\nNo surprise here. \nLet's do EXACTLY the same thing, but skip all the pesky intermediate variables:\ndef decorated_function():\n print(\"I am the decorated function.\")\ndecorated_function = decorator_maker()(decorated_function)\n#outputs:\n#I make decorators! I am executed only once: when you make me create a decorator.\n#As a decorator maker, I return a decorator\n#I am a decorator! I am executed only when you decorate a function.\n#As the decorator, I return the wrapped function.\n\n# Finally:\ndecorated_function() \n#outputs:\n#I am the wrapper around the decorated function. I am called when you call the decorated function.\n#As the wrapper, I return the RESULT of the decorated function.\n#I am the decorated function.\n\nLet's make it even shorter:\n@decorator_maker()\ndef decorated_function():\n print(\"I am the decorated function.\")\n#outputs:\n#I make decorators! I am executed only once: when you make me create a decorator.\n#As a decorator maker, I return a decorator\n#I am a decorator! I am executed only when you decorate a function.\n#As the decorator, I return the wrapped function.\n\n#Eventually: \ndecorated_function() \n#outputs:\n#I am the wrapper around the decorated function. 
I am called when you call the decorated function.\n#As the wrapper, I return the RESULT of the decorated function.\n#I am the decorated function.\n\nHey, did you see that? We used a function call with the \"@\" syntax! :-)\nSo, back to decorators with arguments. If we can use functions to generate the decorator on the fly, we can pass arguments to that function, right?\ndef decorator_maker_with_arguments(decorator_arg1, decorator_arg2):\n\n print(\"I make decorators! And I accept arguments: {0}, {1}\".format(decorator_arg1, decorator_arg2))\n\n def my_decorator(func):\n # The ability to pass arguments here is a gift from closures.\n # If you are not comfortable with closures, you can assume it's ok,\n # or read: http://stackoverflow.com/questions/13857/can-you-explain-closures-as-they-relate-to-python\n print(\"I am the decorator. Somehow you passed me arguments: {0}, {1}\".format(decorator_arg1, decorator_arg2))\n\n # Don't confuse decorator arguments and function arguments!\n def wrapped(function_arg1, function_arg2):\n print(\"I am the wrapper around the decorated function.\\n\"\n \"I can access all the variables\\n\"\n \"\\t- from the decorator: {0} {1}\\n\"\n \"\\t- from the function call: {2} {3}\\n\"\n \"Then I can pass them to the decorated function\"\n .format(decorator_arg1, decorator_arg2,\n function_arg1, function_arg2))\n return func(function_arg1, function_arg2)\n\n return wrapped\n\n return my_decorator\n\n@decorator_maker_with_arguments(\"Leonard\", \"Sheldon\")\ndef decorated_function_with_arguments(function_arg1, function_arg2):\n print(\"I am the decorated function and only knows about my arguments: {0}\"\n \" {1}\".format(function_arg1, function_arg2))\n\ndecorated_function_with_arguments(\"Rajesh\", \"Howard\")\n#outputs:\n#I make decorators! And I accept arguments: Leonard Sheldon\n#I am the decorator. Somehow you passed me arguments: Leonard Sheldon\n#I am the wrapper around the decorated function. 
\n#I can access all the variables \n# - from the decorator: Leonard Sheldon \n# - from the function call: Rajesh Howard \n#Then I can pass them to the decorated function\n#I am the decorated function and only knows about my arguments: Rajesh Howard\n\nHere it is: a decorator with arguments. Arguments can be set as variables:\nc1 = \"Penny\"\nc2 = \"Leslie\"\n\n@decorator_maker_with_arguments(\"Leonard\", c1)\ndef decorated_function_with_arguments(function_arg1, function_arg2):\n print(\"I am the decorated function and only knows about my arguments:\"\n \" {0} {1}\".format(function_arg1, function_arg2))\n\ndecorated_function_with_arguments(c2, \"Howard\")\n#outputs:\n#I make decorators! And I accept arguments: Leonard Penny\n#I am the decorator. Somehow you passed me arguments: Leonard Penny\n#I am the wrapper around the decorated function. \n#I can access all the variables \n# - from the decorator: Leonard Penny \n# - from the function call: Leslie Howard \n#Then I can pass them to the decorated function\n#I am the decorated function and only knows about my arguments: Leslie Howard\n\nAs you can see, you can pass arguments to the decorator like any function using this trick. You can even use *args, **kwargs if you wish. But remember decorators are called only once. Just when Python imports the script. You can't dynamically set the arguments afterwards. When you do \"import x\", the function is already decorated, so you can't change anything.\n\nLet's practice: decorating a decorator\nOkay, as a bonus, I'll give you a snippet to make any decorator generically accept any argument. After all, in order to accept arguments, we created our decorator using another function. 
\nWe wrapped the decorator.\nDid we see anything else recently that wrapped a function?\nOh yes, decorators!\nLet's have some fun and write a decorator for the decorators:\ndef decorator_with_args(decorator_to_enhance):\n \"\"\" \n This function is supposed to be used as a decorator.\n It must decorate another function that is intended to be used as a decorator.\n Take a cup of coffee.\n It will allow any decorator to accept an arbitrary number of arguments,\n saving you the headache of remembering how to do that every time.\n \"\"\"\n\n # We use the same trick we did to pass arguments\n def decorator_maker(*args, **kwargs):\n\n # We create on the fly a decorator that accepts only a function\n # but keeps the passed arguments from the maker.\n def decorator_wrapper(func):\n\n # We return the result of the original decorator, which, after all, \n # IS JUST AN ORDINARY FUNCTION (which returns a function).\n # Only pitfall: the decorator must have this specific signature or it won't work:\n return decorator_to_enhance(func, *args, **kwargs)\n\n return decorator_wrapper\n\n return decorator_maker\n\nIt can be used as follows:\n# You create the function you will use as a decorator. 
And stick a decorator on it :-)\n# Don't forget, the signature is \"decorator(func, *args, **kwargs)\"\n@decorator_with_args \ndef decorated_decorator(func, *args, **kwargs): \n def wrapper(function_arg1, function_arg2):\n print(\"Decorated with {0} {1}\".format(args, kwargs))\n return func(function_arg1, function_arg2)\n return wrapper\n\n# Then you decorate the functions you wish with your brand new decorated decorator.\n\n@decorated_decorator(42, 404, 1024)\ndef decorated_function(function_arg1, function_arg2):\n print(\"Hello {0} {1}\".format(function_arg1, function_arg2))\n\ndecorated_function(\"Universe and\", \"everything\")\n#outputs:\n#Decorated with (42, 404, 1024) {}\n#Hello Universe and everything\n\n# Whoooot!\n\nI know, the last time you had this feeling, it was after listening to a guy saying: \"before understanding recursion, you must first understand recursion\". But now, don't you feel good about mastering this?\n\nBest practices: decorators\n\nDecorators were introduced in Python 2.4, so be sure your code will be run on >= 2.4. \nDecorators slow down the function call. Keep that in mind.\nYou cannot un-decorate a function. (There are hacks to create decorators that can be removed, but nobody uses them.) So once a function is decorated, it's decorated for all the code.\nDecorators wrap functions, which can make them hard to debug. (This gets better from Python >= 2.5; see below.)\n\nThe functools module was introduced in Python 2.5. It includes the function functools.wraps(), which copies the name, module, and docstring of the decorated function to its wrapper. \n(Fun fact: functools.wraps() is a decorator! 
☺)\n# For debugging, the stack trace prints the function's __name__\ndef foo():\n print(\"foo\")\n\nprint(foo.__name__)\n#outputs: foo\n\n# With a decorator, it gets messy \ndef bar(func):\n def wrapper():\n print(\"bar\")\n return func()\n return wrapper\n\n@bar\ndef foo():\n print(\"foo\")\n\nprint(foo.__name__)\n#outputs: wrapper\n\n# \"functools\" can help for that\n\nimport functools\n\ndef bar(func):\n # We say that \"wrapper\" is wrapping \"func\"\n # and the magic begins\n @functools.wraps(func)\n def wrapper():\n print(\"bar\")\n return func()\n return wrapper\n\n@bar\ndef foo():\n print(\"foo\")\n\nprint(foo.__name__)\n#outputs: foo\n\n\nHow can the decorators be useful?\nNow the big question: What can I use decorators for? \nThey seem cool and powerful, but a practical example would be great. Well, there are 1000 possibilities. Classic uses are extending a function's behavior from an external lib (you can't modify it), or for debugging (you don't want to modify it because it's temporary). 
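For that temporary-debugging use, a throwaway tracing decorator is a typical shape. This is only a sketch (the trace name and its output format are invented), built on the functools.wraps pattern shown just above:

```python
import functools

def trace(func):
    # Temporary debugging aid: print each call and its result,
    # without editing the traced function's body.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        res = func(*args, **kwargs)
        print("{0}{1} -> {2!r}".format(func.__name__, args, res))
        return res
    return wrapper

@trace
def add(a, b):
    return a + b

add(2, 3)
# prints: add(2, 3) -> 5
```

Thanks to functools.wraps, add keeps its original __name__ and docstring, so stack traces stay readable and removing the decorator later changes nothing else.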
\nYou can use them to extend several functions in a DRY way, like so:\ndef benchmark(func):\n \"\"\"\n A decorator that prints the time a function takes\n to execute.\n \"\"\"\n import time\n def wrapper(*args, **kwargs):\n t = time.clock()\n res = func(*args, **kwargs)\n print(\"{0} {1}\".format(func.__name__, time.clock()-t))\n return res\n return wrapper\n\n\ndef logging(func):\n \"\"\"\n A decorator that logs the activity of the script.\n (it actually just prints it, but it could be logging!)\n \"\"\"\n def wrapper(*args, **kwargs):\n res = func(*args, **kwargs)\n print(\"{0} {1} {2}\".format(func.__name__, args, kwargs))\n return res\n return wrapper\n\n\ndef counter(func):\n \"\"\"\n A decorator that counts and prints the number of times a function has been executed\n \"\"\"\n def wrapper(*args, **kwargs):\n wrapper.count = wrapper.count + 1\n res = func(*args, **kwargs)\n print(\"{0} has been used: {1}x\".format(func.__name__, wrapper.count))\n return res\n wrapper.count = 0\n return wrapper\n\n@counter\n@benchmark\n@logging\ndef reverse_string(string):\n return ''.join(reversed(string))\n\nprint(reverse_string(\"Able was I ere I saw Elba\"))\nprint(reverse_string(\"A man, a plan, a canoe, pasta, heros, rajahs, a coloratura, maps, snipe, percale, macaroni, a gag, a banana bag, a tan, a tag, a banana bag again (or a camel), a crepe, pins, Spam, a rut, a Rolo, cash, a jar, sore hats, a peon, a canal: Panama!\"))\n\n#outputs:\n#reverse_string ('Able was I ere I saw Elba',) {}\n#wrapper 0.0\n#wrapper has been used: 1x \n#ablE was I ere I saw elbA\n#reverse_string ('A man, a plan, a canoe, pasta, heros, rajahs, a coloratura, maps, snipe, percale, macaroni, a gag, a banana bag, a tan, a tag, a banana bag again (or a camel), a crepe, pins, Spam, a rut, a Rolo, cash, a jar, sore hats, a peon, a canal: Panama!',) {}\n#wrapper 0.0\n#wrapper has been used: 2x\n#!amanaP :lanac a ,noep a ,stah eros ,raj a ,hsac ,oloR a ,tur a ,mapS ,snip ,eperc a ,)lemac a 
ro( niaga gab ananab a ,gat a ,nat a ,gab ananab a ,gag a ,inoracam ,elacrep ,epins ,spam ,arutaroloc a ,shajar ,soreh ,atsap ,eonac a ,nalp a ,nam A\n\nOf course the good thing with decorators is that you can use them right away on almost anything without rewriting. DRY, I said:\n@counter\n@benchmark\n@logging\ndef get_random_futurama_quote():\n from urllib import urlopen\n result = urlopen(\"http://subfusion.net/cgi-bin/quote.pl?quote=futurama\").read()\n try:\n value = result.split(\"
<br><b><hr><br>\")[1].split(\"<br><br><hr><br>\")[0]\n return value.strip()\n except:\n return \"No, I'm ... doesn't!\"\n\n\nprint(get_random_futurama_quote())\nprint(get_random_futurama_quote())\n\n#outputs:\n#get_random_futurama_quote () {}\n#wrapper 0.02\n#wrapper has been used: 1x\n#The laws of science be a harsh mistress.\n#get_random_futurama_quote () {}\n#wrapper 0.01\n#wrapper has been used: 2x\n#Curse you, merciful Poseidon!\n\nPython itself provides several decorators: property, staticmethod, etc. \n\nDjango uses decorators to manage caching and view permissions. \nTwisted uses them to fake inlining asynchronous function calls.\n\nThis really is a large playground.\n"} {"input": "How to merge two Python dictionaries in a single expression? I have two Python dictionaries, and I want to write a single expression that returns these two dictionaries, merged. The update() method would be what I need, if it returned its result instead of modifying a dict in-place.\n>>> x = {'a':1, 'b': 2}\n>>> y = {'b':10, 'c': 11}\n>>> z = x.update(y)\n>>> print z\nNone\n>>> x\n{'a': 1, 'b': 10, 'c': 11}\n\nHow can I get that final merged dict in z, not x?\n(To be extra-clear, the last-one-wins conflict-handling of dict.update() is what I'm looking for as well.)\n", "output": "In your case, what you can do is:\nz = dict(x.items() + y.items())\n\nThis will, as you want it, put the final dict in z, and make the value for key b be properly overridden by the second (y) dict's value:\n>>> x = {'a':1, 'b': 2}\n>>> y = {'b':10, 'c': 11}\n>>> z = dict(x.items() + y.items())\n>>> z\n{'a': 1, 'c': 11, 'b': 10}\n\nIf you use Python 3, it is only a little more complicated. To create z:\n>>> z = dict(list(x.items()) + list(y.items()))\n>>> z\n{'a': 1, 'c': 11, 'b': 10}\n\n"} {"input": "Sort a Python dictionary by value I have a dictionary of values read from two fields in a database: a string field and a numeric field. 
The string field is unique, so that is the key of the dictionary.\nI can sort on the keys, but how can I sort based on the values?\nNote: I have read Stack Overflow question How do I sort a list of dictionaries by values of the dictionary in Python? and probably could change my code to have a list of dictionaries, but since I do not really need a list of dictionaries I wanted to know if there is a simpler solution.\n", "output": "It is not possible to sort a dict, only to get a representation of a dict that is sorted. Dicts are inherently orderless, but other types, such as lists and tuples, are not. So you need a sorted representation, which will be a list, probably a list of tuples.\nFor instance,\nimport operator\nx = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}\nsorted_x = sorted(x.items(), key=operator.itemgetter(1))\n\nsorted_x will be a list of tuples sorted by the second element in each tuple. dict(sorted_x) == x.\nAnd for those wishing to sort on keys instead of values:\nimport operator\nx = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}\nsorted_x = sorted(x.items(), key=operator.itemgetter(0))\n\n"} {"input": "How to check if a directory exists and create it if necessary? What is the most elegant way to check if the directory a file is going to be written to exists, and if not, create the directory using Python? Here is what I tried:\nfilename = \"/my/directory/filename.txt\"\ndir = os.path.dirname(filename)\n\ntry:\n os.stat(dir)\nexcept:\n os.mkdir(dir) \n\nf = file(filename)\n\nSomehow, I missed os.path.exists (thanks kanja, Blair, and Douglas). 
This is what I have now:\ndef ensure_dir(f):\n d = os.path.dirname(f)\n if not os.path.exists(d):\n os.makedirs(d)\n\nIs there a flag for \"open\", that makes this happen automatically?\n", "output": "I see two answers with good qualities, each with a small flaw, so I will give my take on it:\nTry os.path.exists, and consider os.makedirs for the creation.\nif not os.path.exists(directory):\n os.makedirs(directory)\n\nAs noted in comments and elsewhere, there's a race condition - if the directory is created between the os.path.exists and the os.makedirs calls, the os.makedirs will fail with an OSError. Unfortunately, blanket-catching OSError and continuing is not foolproof, as it will ignore a failure to create the directory due to other factors, such as insufficient permissions, full disk, etc.\nOne option would be to trap the OSError and examine the embedded error code, if one knew what's what (on my OS, 13 seems to indicate that permission is denied, and 17 that the file exists - it's not clear that that's even remotely portable, but is explored in Is there a cross-platform way of getting information from Python's OSError). Alternatively, there could be a second os.path.exists, but suppose another created the directory after the first check, then removed it before the second one - we could still be fooled. \nDepending on the application, the danger of concurrent operations may be more or less than the danger posed by other factors such as file permissions. 
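A sketch of that error-code approach, using the symbolic errno.EEXIST rather than a magic number (the helper name is made up; on Python 3.2+, os.makedirs(path, exist_ok=True) does the same job):

```python
import errno
import os
import tempfile

def ensure_dir_eafp(path):
    # Try to create first, then inspect why it failed: this closes the
    # exists/makedirs race while still surfacing real errors
    # (insufficient permissions, full disk, ...).
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise

target = os.path.join(tempfile.mkdtemp(), "a", "b")
ensure_dir_eafp(target)
ensure_dir_eafp(target)  # calling again is harmless
print(os.path.isdir(target))
# outputs: True
```

One caveat with this sketch: a pre-existing regular file at path also raises EEXIST and would be silently accepted, so add an os.path.isdir check if that case matters to you.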
The developer would have to know more about the particular application being developed and its expected environment before choosing an implementation.\n"} {"input": "Using global variables in a function other than the one that created them If I create a global variable in one function, how can I use that variable in another function?\nDo I need to store the global variable in a local variable of the function which needs its access?\n", "output": "You can use a global variable in other functions by declaring it as global in each function that assigns to it:\nglobvar = 0\n\ndef set_globvar_to_one():\n global globvar # Needed to modify global copy of globvar\n globvar = 1\n\ndef print_globvar():\n print globvar # No need for global declaration to read value of globvar\n\nset_globvar_to_one()\nprint_globvar() # Prints 1\n\nI imagine the reason for it is that, since global variables are so dangerous, Python wants to make sure that you really know that's what you're playing with by explicitly requiring the global keyword.\nSee other answers if you want to share a global variable across modules.\n"} {"input": "How do I install pip on Windows? pip is a replacement for easy_install. But should I install pip using easy_install on Windows? Is there a better way?\n", "output": "Python 2.7.9+ and 3.4+\nGood news! Python 3.4 (released March 2014) and Python 2.7.9 (released December 2014) ship with Pip. This is the best feature of any Python release. It makes the community's wealth of libraries accessible to everyone. Newbies are no longer excluded from using community libraries by the prohibitive difficulty of setup. In shipping with a package manager, Python joins Ruby, Node.js, Haskell, Perl, Go--almost every other contemporary language with a majority open-source community. Thank you Python.\nOf course, that doesn't mean Python packaging is problem solved. The experience remains frustrating. 
I discuss this in Stack Overflow question Does Python have a package/module management system?.\nAnd, alas, for everyone using Python 2.7.8 or earlier (a sizable portion of the community): there's no plan to ship Pip to you. Manual instructions follow.\nPython 2 ≤ 2.7.8 and Python 3 ≤ 3.3\nFlying in the face of its 'batteries included' motto, Python ships without a package manager. To make matters worse, Pip was--until recently--ironically difficult to install.\nOfficial instructions\nPer http://www.pip-installer.org/en/latest/installing.html:\nDownload get-pip.py, being careful to save it as a .py file rather than .txt. Then, run it from the command prompt:\npython get-pip.py\n\nYou possibly need an administrator command prompt to do this. Follow Start a Command Prompt as an Administrator (Microsoft TechNet).\nAlternative instructions\nThe official documentation tells users to install Pip and each of its dependencies from source. That's tedious for the experienced, and prohibitively difficult for newbies.\nFor our sake, Christoph Gohlke prepares Windows installers (.msi) for popular Python packages. He builds installers for all Python versions, both 32 and 64 bit. You need to\n\nInstall setuptools\nInstall pip\n\nFor me, this installed Pip at C:\\Python27\\Scripts\\pip.exe. Find pip.exe on your computer, then add its folder (for example, C:\\Python27\\Scripts) to your path (Start / Edit environment variables). Now you should be able to run pip from the command line. Try installing a package:\npip install httpie\n\nThere you go (hopefully)! Solutions for common problems are given below:\nProxy problems\nIf you work in an office, you might be behind an HTTP proxy. If so, set the environment variables http_proxy and https_proxy. Most Python applications (and other free software) respect these. 
Example syntax:\nhttp://proxy_url:port\nhttp://username:password@proxy_url:port\n\nIf you're really unlucky, your proxy might be a Microsoft NTLM proxy. Free software can't cope. The only solution is to install a free software friendly proxy that forwards to the nasty proxy. http://cntlm.sourceforge.net/\nUnable to find vcvarsall.bat\nPython modules can be partly written in C or C++. Pip tries to compile from source. If you don't have a C/C++ compiler installed and configured, you'll see this cryptic error message.\n\nError: Unable to find vcvarsall.bat\n\nYou can fix that by installing a C++ compiler such as MinGW or Visual C++. Microsoft actually ship one specifically for use with Python. Or try Microsoft Visual C++ Compiler for Python 2.7.\nOften though it's easier to check Christoph's site for your package.\n"} {"input": "Best way to check if a list is empty For example, if passed the following:\na = []\n\nHow do I check to see if a is empty?\n", "output": "if not a:\n print(\"List is empty\")\n\nUsing the implicit booleanness of the empty list a is quite pythonic.\n"} {"input": "What is the difference between @staticmethod and @classmethod in Python? What is the difference between a function decorated with @staticmethod and one decorated with @classmethod?\n", "output": "Maybe a bit of example code will help: Notice the difference in the call signatures of foo, class_foo and static_foo:\nclass A(object):\n def foo(self,x):\n print \"executing foo(%s,%s)\"%(self,x)\n\n @classmethod\n def class_foo(cls,x):\n print \"executing class_foo(%s,%s)\"%(cls,x)\n\n @staticmethod\n def static_foo(x):\n print \"executing static_foo(%s)\"%x \n\na=A()\n\nBelow is the usual way an object instance calls a method. 
The object instance, a, is implicitly passed as the first argument.\na.foo(1)\n# executing foo(<__main__.A object at 0xb7dbef0c>,1)\n\n\nWith classmethods, the class of the object instance is implicitly passed as the first argument instead of self.\na.class_foo(1)\n# executing class_foo(<class '__main__.A'>,1)\n\nYou can also call class_foo using the class. In fact, if you define something to be\na classmethod, it is probably because you intend to call it from the class rather than from a class instance. A.foo(1) would have raised a TypeError, but A.class_foo(1) works just fine:\nA.class_foo(1)\n# executing class_foo(<class '__main__.A'>,1)\n\nOne use people have found for class methods is to create inheritable alternative constructors.\n\nWith staticmethods, neither self (the object instance) nor cls (the class) is implicitly passed as the first argument. They behave like plain functions except that you can call them from an instance or the class:\na.static_foo(1)\n# executing static_foo(1)\n\nA.static_foo('hi')\n# executing static_foo(hi)\n\nStaticmethods are used to group functions which have some logical connection with a class to the class.\n\nfoo is just a function, but when you call a.foo you don't just get the function,\nyou get a \"partially applied\" version of the function with the object instance a bound as the first argument to the function. foo expects 2 arguments, while a.foo only expects 1 argument.\na is bound to foo. That is what is meant by the term \"bound\" below:\nprint(a.foo)\n# <bound method A.foo of <__main__.A object at 0xb7dbef0c>>\n\nWith a.class_foo, a is not bound to class_foo, rather the class A is bound to class_foo.\nprint(a.class_foo)\n# <bound method type.class_foo of <class '__main__.A'>>\n\nHere, with a staticmethod, even though it is a method, a.static_foo just returns\na good 'ole function with no arguments bound. static_foo expects 1 argument, and\na.static_foo expects 1 argument too.\nprint(a.static_foo)\n# <function static_foo at 0xb7d479ec>\n\nAnd of course the same thing happens when you call static_foo with the class A instead.\nprint(A.static_foo)\n# <function static_foo at 0xb7d479ec>\n\n"} {"input": "append vs. 
extend What's the difference between the list methods append() and extend()?\n", "output": "append: Appends object at end.\nx = [1, 2, 3]\nx.append([4, 5])\nprint (x)\n\ngives you: [1, 2, 3, [4, 5]]\n\nextend: Extends list by appending elements from the iterable.\nx = [1, 2, 3]\nx.extend([4, 5])\nprint (x)\n\ngives you: [1, 2, 3, 4, 5]\n"} {"input": "Does Python have a string contains substring method? I'm looking for a string.contains or string.indexof method in Python.\nI want to do:\nif not somestring.contains(\"blah\"):\n continue\n\n", "output": "You can use the in operator:\nif \"blah\" not in somestring: \n continue\n\n"} {"input": "Accessing the index in Python 'for' loops How do I access the index itself for a list like the following?\nints = [8, 23, 45, 12, 78]\n\nWhen I loop through it using a for loop, how do I access the loop index, from 1 to 5 in this case?\n", "output": "Using an additional state variable, such as an index variable (which you would normally use in languages such as C or PHP), is considered non-pythonic.\nThe better option is to use the built-in function enumerate(), available in both Python 2 and 3:\nfor idx, val in enumerate(ints):\n print(idx, val)\n\nCheck out PEP 279 for more.\n"} {"input": "How to list all files of a directory in Python How can I list all files of a directory in python and add them to a list?\n", "output": "os.listdir() will get you everything that's in a directory - files and directories.\nIf you want just files, you could either filter this down using os.path:\nfrom os import listdir\nfrom os.path import isfile, join\nonlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]\n\nor you could use os.walk() which will yield two lists for each directory it visits - splitting into files and dirs for you. 
If you only want the top directory you can just break the first time it yields\nfrom os import walk\n\nf = []\nfor (dirpath, dirnames, filenames) in walk(mypath):\n f.extend(filenames)\n break\n\nAnd lastly, as that example shows, adding one list to another you can either use .extend() or \n>>> q = [1, 2, 3]\n>>> w = [4, 5, 6]\n>>> q = q + w\n>>> q\n[1, 2, 3, 4, 5, 6]\n\nPersonally, I prefer .extend()\n"} {"input": "\"Least Astonishment\" and the Mutable Default Argument Anyone tinkering with Python long enough has been bitten (or torn to pieces) by the following issue:\ndef foo(a=[]):\n a.append(5)\n return a\n\nPython novices would expect this function to always return a list with only one element: [5]. The result is instead very different, and very astonishing (for a novice):\n>>> foo()\n[5]\n>>> foo()\n[5, 5]\n>>> foo()\n[5, 5, 5]\n>>> foo()\n[5, 5, 5, 5]\n>>> foo()\n\nA manager of mine once had his first encounter with this feature, and called it \"a dramatic design flaw\" of the language. I replied that the behavior had an underlying explanation, and it is indeed very puzzling and unexpected if you don't understand the internals. However, I was not able to answer (to myself) the following question: what is the reason for binding the default argument at function definition, and not at function execution? I doubt the experienced behavior has a practical use (who really used static variables in C, without breeding bugs?)\nEdit: \nBaczek made an interesting example. Together with most of your comments and Utaal's in particular, I elaborated further:\n>>> def a():\n... print \"a executed\"\n... return []\n... \n>>> \n>>> def b(x=a()):\n... x.append(5)\n... print x\n... 
\na executed\n>>> b()\n[5]\n>>> b()\n[5, 5]\n\nTo me, it seems that the design decision was relative to where to put the scope of parameters: inside the function or \"together\" with it?\nDoing the binding inside the function would mean that x is effectively bound to the specified default when the function is called, not defined, something that would present a deep flaw: the def line would be \"hybrid\" in the sense that part of the binding (of the function object) would happen at definition, and part (assignment of default parameters) at function invocation time.\nThe actual behavior is more consistent: everything of that line gets evaluated when that line is executed, meaning at function definition.\n", "output": "Actually, this is not a design flaw, and it is not because of internals, or performance.\nIt comes simply from the fact that functions in Python are first-class objects, and not only a piece of code.\nAs soon as you get to think into this way, then it completely makes sense: a function is an object being evaluated on its definition; default parameters are kind of \"member data\" and therefore their state may change from one call to the other - exactly as in any other object.\nIn any case, Effbot has a very nice explanation of the reasons for this behavior in Default Parameter Values in Python.\nI found it very clear, and I really suggest reading it for a better knowledge of how function objects work.\n"} {"input": "How do I pass a variable by reference? The Python documentation seems unclear about whether parameters are passed by reference or value, and the following code produces the unchanged value 'Original'\nclass PassByReference:\n def __init__(self):\n self.variable = 'Original'\n self.Change(self.variable)\n print self.variable\n\n def Change(self, var):\n var = 'Changed'\n\nIs there something I can do to pass the variable by actual reference?\n", "output": "Arguments are passed by assignment. 
The rationale behind this is twofold:\n\nthe parameter passed in is actually a reference to an object (but the reference is passed by value)\nsome data types are mutable, but others aren't\n\nSo:\n\nIf you pass a mutable object into a method, the method gets a reference to that same object and you can mutate it to your heart's delight, but if you rebind the reference in the method, the outer scope will know nothing about it, and after you're done, the outer reference will still point at the original object. \nIf you pass an immutable object to a method, you still can't rebind the outer reference, and you can't even mutate the object.\n\nTo make it even more clear, let's have some examples. \nList - a mutable type\nLet's try to modify the list that was passed to a method:\ndef try_to_change_list_contents(the_list):\n print 'got', the_list\n the_list.append('four')\n print 'changed to', the_list\n\nouter_list = ['one', 'two', 'three']\n\nprint 'before, outer_list =', outer_list\ntry_to_change_list_contents(outer_list)\nprint 'after, outer_list =', outer_list\n\nOutput:\nbefore, outer_list = ['one', 'two', 'three']\ngot ['one', 'two', 'three']\nchanged to ['one', 'two', 'three', 'four']\nafter, outer_list = ['one', 'two', 'three', 'four']\n\nSince the parameter passed in is a reference to outer_list, not a copy of it, we can use the mutating list methods to change it and have the changes reflected in the outer scope.\nNow let's see what happens when we try to change the reference that was passed in as a parameter:\ndef try_to_change_list_reference(the_list):\n print 'got', the_list\n the_list = ['and', 'we', 'can', 'not', 'lie']\n print 'set to', the_list\n\nouter_list = ['we', 'like', 'proper', 'English']\n\nprint 'before, outer_list =', outer_list\ntry_to_change_list_reference(outer_list)\nprint 'after, outer_list =', outer_list\n\nOutput:\nbefore, outer_list = ['we', 'like', 'proper', 'English']\ngot ['we', 'like', 'proper', 'English']\nset to ['and', 'we', 'can', 
'not', 'lie']\nafter, outer_list = ['we', 'like', 'proper', 'English']\n\nSince the the_list parameter was passed by value, assigning a new list to it had no effect that the code outside the method could see. The the_list was a copy of the outer_list reference, and we had the_list point to a new list, but there was no way to change where outer_list pointed.\nString - an immutable type\nIt's immutable, so there's nothing we can do to change the contents of the string\nNow, let's try to change the reference\ndef try_to_change_string_reference(the_string):\n print 'got', the_string\n the_string = 'In a kingdom by the sea'\n print 'set to', the_string\n\nouter_string = 'It was many and many a year ago'\n\nprint 'before, outer_string =', outer_string\ntry_to_change_string_reference(outer_string)\nprint 'after, outer_string =', outer_string\n\nOutput:\nbefore, outer_string = It was many and many a year ago\ngot It was many and many a year ago\nset to In a kingdom by the sea\nafter, outer_string = It was many and many a year ago\n\nAgain, since the the_string parameter was passed by value, assigning a new string to it had no effect that the code outside the method could see. The the_string was a copy of the outer_string reference, and we had the_string point to a new string, but there was no way to change where outer_string pointed.\nI hope this clears things up a little.\nEDIT: It's been noted that this doesn't answer the question that @David originally asked, \"Is there something I can do to pass the variable by actual reference?\". Let's work on that.\nHow do we get around this?\nAs @Andrea's answer shows, you could return the new value. 
This doesn't change the way things are passed in, but does let you get the information you want back out:\ndef return_a_whole_new_string(the_string):\n new_string = something_to_do_with_the_old_string(the_string)\n return new_string\n\n# then you could call it like\nmy_string = return_a_whole_new_string(my_string)\n\nIf you really wanted to avoid using a return value, you could create a class to hold your value and pass it into the function or use an existing class, like a list:\ndef use_a_wrapper_to_simulate_pass_by_reference(stuff_to_change):\n new_string = something_to_do_with_the_old_string(stuff_to_change[0])\n stuff_to_change[0] = new_string\n\n# then you could call it like\nwrapper = [my_string]\nuse_a_wrapper_to_simulate_pass_by_reference(wrapper)\n\ndo_something_with(wrapper[0])\n\nAlthough this seems a little cumbersome.\n"} {"input": "Difference between __str__ and __repr__ in Python What is the difference between __str__ and __repr__ in Python?\n", "output": "Alex summarized well but, surprisingly, was too succinct.\nFirst, let me reiterate the main points in Alex's post:\n\nThe default implementation is useless (it's hard to think of one which wouldn't be, but yeah)\n__repr__ goal is to be unambiguous\n__str__ goal is to be readable\nContainer's __str__ uses contained objects' __repr__\n\nDefault implementation is useless\nThis is mostly a surprise because Python's defaults tend to be fairly useful. However, in this case, having a default for __repr__ which would act like:\nreturn \"%s(%r)\" % (self.__class__, self.__dict__)\n\nwould have been too dangerous (for example, too easy to get into infinite recursion if objects reference each other). So Python cops out. 
Note that there is one default which is true: if __repr__ is defined, and __str__ is not, the object will behave as though __str__=__repr__.\nThis means, in simple terms: almost every object you implement should have a functional __repr__ that's usable for understanding the object. Implementing __str__ is optional: do that if you need a \"pretty print\" functionality (for example, used by a report generator).\nThe goal of __repr__ is to be unambiguous\nLet me come right out and say it -- I do not believe in debuggers. I don't really know how to use any debugger, and have never used one seriously. Furthermore, I believe that the big fault in debuggers is their basic nature -- most failures I debug happened a long long time ago, in a galaxy far far away. This means that I do believe, with religious fervor, in logging. Logging is the lifeblood of any decent fire-and-forget server system. Python makes it easy to log: with maybe some project specific wrappers, all you need is a\nlog(INFO, \"I am in the weird function and a is\", a, \"and b is\", b, \"but I got a null C -- using default\", default_c)\n\nBut you have to do the last step -- make sure every object you implement has a useful repr, so code like that can just work. This is why the \"eval\" thing comes up: if you have enough information so eval(repr(c))==c, that means you know everything there is to know about c. If that's easy enough, at least in a fuzzy way, do it. If not, make sure you have enough information about c anyway. I usually use an eval-like format: \"MyClass(this=%r,that=%r)\" % (self.this,self.that). 
It does not mean that you can actually construct MyClass, or that those are the right constructor arguments -- but it is a useful form to express \"this is everything you need to know about this instance\".\nNote: I used %r above, not %s. You always want to use repr() [or %r formatting character, equivalently] inside __repr__ implementation, or you're defeating the goal of repr. You want to be able to differentiate MyClass(3) and MyClass(\"3\").\nThe goal of __str__ is to be readable\nSpecifically, it is not intended to be unambiguous -- notice that str(3)==str(\"3\"). Likewise, if you implement an IP abstraction, having the str of it look like 192.168.1.1 is just fine. When implementing a date/time abstraction, the str can be \"2010/4/12 15:35:22\", etc. The goal is to represent it in a way that a user, not a programmer, would want to read it. Chop off useless digits, pretend to be some other class -- as long as it supports readability, it is an improvement.\nContainer's __str__ uses contained objects' __repr__\nThis seems surprising, doesn't it? It is a little, but how readable would\n[moshe is, 3, hello\nworld, this is a list, oh I don't know, containing just 4 elements]\n\nbe? Not very. Specifically, the strings in a container would find it way too easy to disturb its string representation. In the face of ambiguity, remember, Python resists the temptation to guess. If you want the above behavior when you're printing a list, just\nprint \"[\"+\", \".join(l)+\"]\"\n\n(you can probably also figure out what to do about dictionaries.)\nSummary\nImplement __repr__ for any class you implement. This should be second nature. 
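As a minimal sketch of both conventions (the class and field names are made up for illustration):

```python
# Hypothetical Point class: __repr__ is unambiguous (here even eval-able),
# __str__ is the readable version for users.
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return "Point(x=%r, y=%r)" % (self.x, self.y)

    def __str__(self):
        return "(%s, %s)" % (self.x, self.y)

p = Point(1, 2)
print(repr(p))  # Point(x=1, y=2)
print(str(p))   # (1, 2)
print([p])      # containers use __repr__: [Point(x=1, y=2)]
```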
Implement __str__ if you think it would be useful to have a string version which errs on the side of more readability in favor of more ambiguity.\n"} {"input": "Is there a way to run Python on Android? We are working on an S60 version and this platform has a nice Python API. \nHowever, there is nothing official about Python on Android, but since Jython exists, is there a way to let the snake and the robot work together?\n", "output": "One way is to use Kivy:\n\nOpen source Python library for rapid development of applications\n that make use of innovative user interfaces, such as multi-touch apps.\n\n\n\nKivy runs on Linux, Windows, OS X, Android and iOS. You can run the same [python] code on all supported platforms.\n\nKivy Showcase app \n"} {"input": "Explain Python's slice notation I need a good explanation (references are a plus) on Python's slice notation. \nTo me, this notation needs a bit of picking up. \nIt looks extremely powerful, but I haven't quite got my head around it.\n", "output": "It's pretty simple really:\na[start:end] # items start through end-1\na[start:] # items start through the rest of the array\na[:end] # items from the beginning through end-1\na[:] # a copy of the whole array\n\nThere is also the step value, which can be used with any of the above:\na[start:end:step] # start through not past end, by step\n\nThe key point to remember is that the :end value represents the first value that is not in the selected slice. So, the difference between end and start is the number of elements selected (if step is 1, the default).\nThe other feature is that start or end may be a negative number, which means it counts from the end of the array instead of the beginning. So:\na[-1] # last item in the array\na[-2:] # last two items in the array\na[:-2] # everything except the last two items\n\nPython is kind to the programmer if there are fewer items than you ask for. 
For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.\n"} {"input": "Finding the index of an item given a list containing it in Python For a list [\"foo\", \"bar\", \"baz\"] and an item in the list \"bar\", what's the cleanest way to get its index (1) in Python?\n", "output": ">>> [\"foo\", \"bar\", \"baz\"].index(\"bar\")\n1\n\nReference: Data Structures > More on Lists\n"} {"input": "Catch multiple exceptions in one line (except block) I know that I can do:\ntry:\n # do something that may fail\nexcept:\n # do this if ANYTHING goes wrong\n\nI can also do this:\ntry:\n # do something that may fail\nexcept IDontLikeYourFaceException:\n # put on makeup or smile\nexcept YouAreTooShortException:\n # stand on a ladder\n\nBut if I want to do the same thing inside two different exceptions, the best I can think of right now is to do this:\ntry:\n # do something that may fail\nexcept IDontLikeYouException:\n # say please\nexcept YouAreBeingMeanException:\n # say please\n\nIs there any way that I can do something like this (since the action to take in both exceptions is to say please):\ntry:\n # do something that may fail\nexcept IDontLikeYouException, YouAreBeingMeanException:\n # say please\n\nNow this really won't work, as it matches the syntax for:\ntry:\n # do something that may fail\nexcept Exception, e:\n # say please\n\nSo, my effort to catch the two distinct exceptions doesn't exactly come through.\nIs there a way to do this?\n", "output": "From https://docs.python.org/2/tutorial/errors.html#handling-exceptions: \"An except clause may name multiple exceptions as a parenthesized tuple, for example\":\nexcept (IDontLikeYouException, YouAreBeingMeanException) as e:\n pass\n\nSeparating the exception from the variable with a comma will still work in Python 2.6 and 2.7, but is now deprecated and does not work in Python 3; now you 
should be using as.\n"} {"input": "Check if a given key already exists in a dictionary I wanted to test if a key exists in a dictionary before updating the value for the key.\nI wrote the following code:\nif 'key1' in dict.keys():\n print \"blah\"\nelse:\n print \"boo\"\n\nI think this is not the best way to accomplish this task. Is there a better way to test for a key in the dictionary?\n", "output": "in is the intended way to test for the existence of a key in a dict.\nd = dict()\n\nfor i in xrange(100):\n key = i % 10\n if key in d:\n d[key] += 1\n else:\n d[key] = 1\n\nIf you wanted a default, you can always use dict.get():\nd = dict()\n\nfor i in xrange(100):\n key = i % 10\n d[key] = d.get(key, 0) + 1\n\n... and if you wanted to always ensure a default value for any key you can use defaultdict from the collections module, like so:\nfrom collections import defaultdict\n\nd = defaultdict(lambda: 0)\n\nfor i in xrange(100):\n d[i % 10] += 1\n\n... but in general, the in keyword is the best way to do it.\n"} {"input": "Add key to a dictionary in Python? Is it possible to add a key to a Python dictionary after it has been created? It doesn't seem to have an .add() method.\n", "output": ">>> d = {'key':'value'}\n>>> print d\n{'key': 'value'}\n>>> d['mynewkey'] = 'mynewvalue'\n>>> print d\n{'mynewkey': 'mynewvalue', 'key': 'value'}\n\n"} {"input": "How to get current time in Python What is the module/method used to get current time?\n", "output": ">>> import datetime\n>>> datetime.datetime.now()\ndatetime.datetime(2009, 1, 6, 15, 8, 24, 78915)\n\nAnd just the time:\n>>> datetime.datetime.time(datetime.datetime.now())\ndatetime.time(15, 8, 24, 78915)\n\nThe same but slightly more compact:\n>>> datetime.datetime.now().time()\n\nSee the documentation for more info.\nTo save typing, you can import the datetime object from the datetime module:\n>>> from datetime import datetime\n\nThen remove the leading datetime. 
from all the above.\n"} {"input": "How can I represent an 'Enum' in Python? I'm mainly a C# developer, but I'm currently working on a project in Python.\nHow can I represent the equivalent of an Enum in Python? \n", "output": "Enums have been added to Python 3.4 as described in PEP 435. It has also been backported to 3.3, 3.2, 3.1, 2.7, 2.6, 2.5, and 2.4 on pypi. \nFor more advanced Enum techniques try the aenum library (2.7, 3.3+, same author as enum34. Code is not perfectly compatible between py2 and py3, e.g. you'll need __order__ in python 2).\n\nTo use enum34, do $ pip install enum34\nTo use aenum, do $ pip install aenum\n\nInstalling enum (no numbers) will install a completely different and incompatible version.\n\nfrom enum import Enum # for enum34, or the stdlib version\n# from aenum import Enum # for the aenum version\nAnimal = Enum('Animal', 'ant bee cat dog')\n\nAnimal.ant # returns <Animal.ant: 1>\nAnimal['ant'] # returns <Animal.ant: 1> (string lookup)\nAnimal.ant.name # returns 'ant' (inverse lookup)\n\nor equivalently:\nclass Animal(Enum):\n ant = 1\n bee = 2\n cat = 3\n dog = 4\n\n\nIn earlier versions, one way of accomplishing enums is:\ndef enum(**enums):\n return type('Enum', (), enums)\n\nwhich is used like so:\n>>> Numbers = enum(ONE=1, TWO=2, THREE='three')\n>>> Numbers.ONE\n1\n>>> Numbers.TWO\n2\n>>> Numbers.THREE\n'three'\n\nYou can also easily support automatic enumeration with something like this:\ndef enum(*sequential, **named):\n enums = dict(zip(sequential, range(len(sequential))), **named)\n return type('Enum', (), enums)\n\nand used like so:\n>>> Numbers = enum('ZERO', 'ONE', 'TWO')\n>>> Numbers.ZERO\n0\n>>> Numbers.ONE\n1\n\nSupport for converting the values back to names can be added this way:\ndef enum(*sequential, **named):\n enums = dict(zip(sequential, range(len(sequential))), **named)\n reverse = dict((value, key) for key, value in enums.iteritems())\n enums['reverse_mapping'] = reverse\n return type('Enum', (), enums)\n\nThis overwrites anything with that 
name, but it is useful for rendering your enums in output. It will throw KeyError if the reverse mapping doesn't exist. With the first example:\n>>> Numbers.reverse_mapping['three']\n'THREE'\n\n"} {"input": "Parse String to Float or Int In Python, how can I parse a numeric string like \"545.2222\" to its corresponding float value, 545.2222? Or parse the string \"31\" to an integer, 31?\nI just want to know how to parse a float string to a float, and (separately) an int string to an int.\n", "output": ">>> a = \"545.2222\"\n>>> float(a)\n545.22220000000004\n>>> int(float(a))\n545\n\n"} {"input": "How can I make a time delay in Python? I would like to know how to put a time delay in a Python script.\n", "output": "import time\ntime.sleep(5) # delays for 5 seconds\n\nHere is another example where something is run once a minute:\nimport time \nwhile True:\n print \"This prints once a minute.\"\n time.sleep(60) # Delay for 1 minute (60 seconds)\n\n"} {"input": "Iterating over dictionaries using for loops in Python I am a bit puzzled by the following code:\nd = {'x': 1, 'y': 2, 'z': 3} \nfor key in d:\n print key, 'corresponds to', d[key]\n\nWhat I don't understand is the key portion. How does Python recognize that it needs only to read the key from the dictionary? Is key a special word in Python? Or is it simply a variable?\n", "output": "key is just a variable name. \nfor key in d:\n\nwill simply loop over the keys in the dictionary, rather than the keys and values. To loop over both key and value you can use the following:\nFor Python 2.x:\nfor key, value in d.iteritems():\n\nFor Python 3.x:\nfor key, value in d.items():\n\nTo test for yourself, change the word key to poop.\nFor Python 3.x, iteritems() has been replaced with simply items(), which returns a set-like view backed by the dict, like iteritems() but even better. \nThis is also available in 2.7 as viewitems(). 
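For instance, using the dictionary from the question, a quick Python 3 sketch of looping over keys and values together:

```python
# items() yields (key, value) pairs, so both can be unpacked in the loop.
d = {'x': 1, 'y': 2, 'z': 3}
for key, value in d.items():
    print(key, 'corresponds to', value)
```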
\nThe operation items() will work for both 2 and 3, but in 2 it will return a list of the dictionary's (key, value) pairs, which will not reflect changes to the dict that happen after the items() call. If you want the 2.x behavior in 3.x, you can call list(d.items()).\n"} {"input": "Making a flat list out of list of lists in Python I wonder whether there is a shortcut to make a simple list out of list of lists in Python.\nI can do that in a for loop, but maybe there is some cool \"one-liner\"? I tried it with reduce, but I get an error.\nCode\nl = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]\nreduce(lambda x, y: x.extend(y), l)\n\nError message\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\n File \"<stdin>\", line 1, in <lambda>\nAttributeError: 'NoneType' object has no attribute 'extend'\n\n", "output": "[item for sublist in l for item in sublist]\n\nis faster than the shortcuts posted so far. (l is the list to flatten.)\nFor evidence, as always, you can use the timeit module in the standard library:\n$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'\n10000 loops, best of 3: 143 usec per loop\n$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'\n1000 loops, best of 3: 969 usec per loop\n$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'\n1000 loops, best of 3: 1.1 msec per loop\n\nExplanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists -- as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). 
So (for simplicity and without actual loss of generality) say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.\nThe list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.\n"} {"input": "Print in terminal with colors using Python? How can I output colored text to the terminal, in Python?\nWhat is the best Unicode symbol to represent a solid block?\n", "output": "This somewhat depends on what platform you are on. The most common way to do this is by printing ANSI escape sequences. For a simple example, here's some python code from the blender build scripts:\nclass bcolors:\n HEADER = '\\033[95m'\n OKBLUE = '\\033[94m'\n OKGREEN = '\\033[92m'\n WARNING = '\\033[93m'\n FAIL = '\\033[91m'\n ENDC = '\\033[0m'\n BOLD = '\\033[1m'\n UNDERLINE = '\\033[4m'\n\nTo use code like this, you can do something like \nprint bcolors.WARNING + \"Warning: No active frommets remain. Continue?\" + bcolors.ENDC\n\nThis will work on unixes including OS X, linux and windows (provided you use ANSICON, or in Windows 10 provided you enable VT100 emulation). There are ansi codes for setting the color, moving the cursor, and more.\nIf you are going to get complicated with this (and it sounds like you are if you are writing a game), you should look into the \"curses\" module, which handles a lot of the complicated parts of this for you. The Python Curses HowTO is a good introduction.\nIf you are not using extended ASCII (i.e. not on a PC), you are stuck with the ascii characters below 127, and '#' or '@' is probably your best bet for a block. If you can ensure your terminal is using an IBM extended ascii character set, you have many more options. 
Characters 176, 177, 178 and 219 are the \"block characters\".\nSome modern text-based programs, such as \"Dwarf Fortress\", emulate text mode in a graphical mode, and use images of the classic PC font. You can find some of these bitmaps that you can use on the Dwarf Fortress Wiki (see user-made tilesets).\nThe Text Mode Demo Contest has more resources for doing graphics in text mode.\nHmm.. I think I got a little carried away on this answer. I am in the midst of planning an epic text-based adventure game, though. Good luck with your colored text!\n"} {"input": "Static class variables in Python Is it possible to have static class variables or methods in python? What syntax is required to do this?\n", "output": "Variables declared inside the class definition, but not inside a method are class or static variables:\n>>> class MyClass:\n... i = 3\n...\n>>> MyClass.i\n3 \n\nAs @millerdev points out, this creates a class-level \"i\" variable, but this is distinct from any instance-level \"i\" variable, so you could have\n>>> m = MyClass()\n>>> m.i = 4\n>>> MyClass.i, m.i\n(3, 4)\n\nThis is different from C++ and Java, but not so different from C#, where a static member can't be accessed using a reference to an instance.\nSee what the Python tutorial has to say on the subject of classes and class objects.\n@Steve Johnson has already answered regarding static methods, also documented under \"Built-in Functions\" in the Python Library Reference.\nclass C:\n @staticmethod\n def f(arg1, arg2, ...): ...\n\n@beidy recommends classmethods over staticmethod, as the method then receives the class type as the first argument, but I'm still a little fuzzy on the advantages of this approach over staticmethod. 
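One often-cited advantage (a sketch with made-up names, not from the original answer): because the method receives the class it was called on, alternative constructors written as classmethods keep working under inheritance:

```python
# Hypothetical sketch: cls is whatever class the method is called on,
# so Derived.from_string builds a Derived, not a Base.
class Base:
    def __init__(self, value):
        self.value = value

    @classmethod
    def from_string(cls, text):
        return cls(int(text))  # cls is the *calling* class

class Derived(Base):
    pass

print(type(Base.from_string("1")).__name__)     # Base
print(type(Derived.from_string("2")).__name__)  # Derived
```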
If you are too, then it probably doesn't matter.\n"} {"input": "How to get the size of a list items = []\nitems.append(\"apple\")\nitems.append(\"orange\")\nitems.append(\"banana\")\n\n# FAKE METHOD::\nitems.amount() # Should return 3\n\nHow do I do it right?\n", "output": "The len() function can be used with a lot of types in Python - both built-in types and library types.\n>>> len([1,2,3])\n3\n\n"} {"input": "Is there a way to substring a string in Python? Is there a way to substring a string in Python, to get a new string from the 3rd character to the end of the string?\nMaybe like myString[2:end]?\nIf leaving the second part means 'till the end', if you leave the first part, does it start from the start?\n", "output": ">>> x = \"Hello World!\"\n>>> x[2:]\n'llo World!'\n>>> x[:2]\n'He'\n>>> x[:-2]\n'Hello Worl'\n>>> x[-2:]\n'd!'\n>>> x[2:-2]\n'llo Worl'\n\nPython calls this concept \"slicing\" and it works on more than just strings. Take a look here for a comprehensive introduction.\n"} {"input": "How do you split a list into evenly sized chunks? I have a list of arbitrary length, and I need to split it up into equal size chunks and operate on it. There are some obvious ways to do this, like keeping a counter and two lists, and when the second list fills up, add it to the first list and empty the second list for the next round of data, but this is potentially extremely expensive.\nI was wondering if anyone had a good solution to this for lists of any length, e.g. using generators.\nThis should work:\nl = range(1, 1000)\nprint chunks(l, 10) -> [ [ 1..10 ], [ 11..20 ], .., [ 991..999 ] ]\n\nI was looking for something useful in itertools but I couldn't find anything obviously useful. 
Might've missed it, though.\nRelated question: What is the most \"pythonic\" way to iterate over a list in chunks?\n", "output": "Here's a generator that yields the chunks you want:\ndef chunks(l, n):\n \"\"\"Yield successive n-sized chunks from l.\"\"\"\n for i in range(0, len(l), n):\n yield l[i:i + n]\n\n\nimport pprint\npprint.pprint(list(chunks(range(10, 75), 10)))\n[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],\n [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],\n [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],\n [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],\n [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],\n [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],\n [70, 71, 72, 73, 74]]\n\n\nIf you're using Python 2, you should use xrange() instead of range():\ndef chunks(l, n):\n \"\"\"Yield successive n-sized chunks from l.\"\"\"\n for i in xrange(0, len(l), n):\n yield l[i:i + n]\n\n\nAlso you can simply use a list comprehension instead of writing a function. Python 3:\n[l[i:i + n] for i in range(0, len(l), n)]\n\nPython 2 version:\n[l[i:i + n] for i in xrange(0, len(l), n)]\n\n"} {"input": "How can I remove (chomp) a newline in Python? What is the Python equivalent of Perl's chomp function, which removes the last character of a value?\n", "output": "Try the rstrip method.\n>>> 'test string\\n'.rstrip()\n'test string'\n\nNote that Python's rstrip method strips all kinds of trailing whitespace by default, not just one newline as Perl does with chomp. To strip only newlines:\n>>> 'test string \\n\\n'.rstrip('\\n')\n'test string '\n\nThere is also the lstrip and strip methods.\n>>> s = \" \\n abc def \"\n>>> s.strip()\n'abc def'\n>>> s.rstrip()\n' \\n abc def'\n>>> s.lstrip()\n'abc def '\n>>>\n\n"} {"input": "Static methods in Python?
Is it possible to have static methods in Python so I can call them without initializing a class, like:\nClassName.StaticMethod ( )\n\n", "output": "Yep, using the staticmethod decorator\nclass MyClass(object):\n @staticmethod\n def the_static_method(x):\n print x\n\nMyClass.the_static_method(2) # outputs 2\n\nNote that some code might use the old method of defining a static method, using staticmethod as a function rather than a decorator. This should only be used if you have to support ancient versions of Python (2.2 and 2.3):\nclass MyClass(object):\n def the_static_method(x):\n print x\n the_static_method = staticmethod(the_static_method)\n\nMyClass.the_static_method(2) # outputs 2\n\nThis is entirely identical to the first example (using @staticmethod), just not using the nice decorator syntax.\nFinally, use staticmethod() sparingly! There are very few situations where static methods are necessary in Python, and I've seen them used many times where a separate \"top-level\" function would have been clearer.\n\nThe following is verbatim from the documentation:\n\nA static method does not receive an implicit first argument. To declare a static method, use this idiom:\nclass C:\n @staticmethod\n def f(arg1, arg2, ...): ...\n\nThe @staticmethod form is a function decorator – see the description of function definitions in Function definitions for details.\nIt can be called either on the class (such as C.f()) or on an instance (such as C().f()). The instance is ignored except for its class.\nStatic methods in Python are similar to those found in Java or C++. For a more advanced concept, see classmethod().\nFor more information on static methods, consult the documentation on the standard type hierarchy in The standard type hierarchy.\nNew in version 2.2.\nChanged in version 2.4: Function decorator syntax added.\n\n"} {"input": "How to clone or copy a list?
What are the options to clone or copy a list in Python?\nUsing new_list = my_list then modifies new_list every time my_list changes.\nWhy is this?\n", "output": "You have various possibilities:\n\nYou can slice it: \nnew_list = old_list[:]\n\nAlex Martelli's opinion (at least back in 2007) about this is that it is a weird syntax and it does not make sense to use it ever. ;) (In his opinion, the next one is more readable).\nYou can use the built-in list() function:\nnew_list = list(old_list)\n\nYou can use generic copy.copy():\nimport copy\nnew_list = copy.copy(old_list)\n\nThis is a little slower than list() because it has to find out the datatype of old_list first.\nIf the list contains objects and you want to copy them as well, use generic copy.deepcopy():\nimport copy\nnew_list = copy.deepcopy(old_list)\n\nObviously the slowest and most memory-needing method, but sometimes unavoidable.\n\nExample:\nimport copy\n\nclass Foo(object):\n def __init__(self, val):\n self.val = val\n\n def __repr__(self):\n return str(self.val)\n\nfoo = Foo(1)\n\na = ['foo', foo]\nb = a[:]\nc = list(a)\nd = copy.copy(a)\ne = copy.deepcopy(a)\n\n# edit original list and instance\na.append('baz')\nfoo.val = 5\n\nprint('original: %r\\n slice: %r\\n list(): %r\\n copy: %r\\n deepcopy: %r'\n % (a, b, c, d, e))\n\nResult:\noriginal: ['foo', 5, 'baz']\nslice: ['foo', 5]\nlist(): ['foo', 5]\ncopy: ['foo', 5]\ndeepcopy: ['foo', 1]\n\n"} {"input": "How to append list to second list (concatenate lists) How do I concatenate two lists in Python?\nExample:\nlistone = [1, 2, 3]\nlisttwo = [4, 5, 6]\n\nExpected outcome:\njoinedlist == [1, 2, 3, 4, 5, 6]\n\n", "output": "Python makes this ridiculously easy.\nmergedlist = listone + listtwo\n\n"} {"input": "Python join, why is it string.join(list) instead of list.join(string)? This has always confused me.
It seems like this would be nicer:\nmy_list = [\"Hello\", \"world\"]\nprint my_list.join(\"-\")\n# Produce: \"Hello-world\"\n\nThan this:\nmy_list = [\"Hello\", \"world\"]\nprint \"-\".join(my_list)\n# Produce: \"Hello-world\"\n\nIs there a specific reason it does it like this?\n", "output": "It's because any iterable can be joined, not just lists, but the result and the \"joiner\" are always strings.\nE.g.:\nimport urllib2\nprint '\\n############\\n'.join(\n urllib2.urlopen('http://data.stackexchange.com/users/7095'))\n\n"} {"input": "Why is reading lines from stdin much slower in C++ than Python? I wanted to compare reading lines of string input from stdin using Python and C++ and was shocked to see my C++ code run an order of magnitude slower than the equivalent Python code. Since my C++ is rusty and I'm not yet an expert Pythonista, please tell me if I'm doing something wrong or if I'm misunderstanding something. \n\n(tl;dr answer: include the statement: cin.sync_with_stdio(false) or just use fgets instead.\ntl;dr results: scroll all the way down to the bottom of my question and look at the table.)\n\nC++ code:\n#include <iostream>\n#include <time.h>\n\nusing namespace std;\n\nint main() {\n string input_line;\n long line_count = 0;\n time_t start = time(NULL);\n int sec;\n int lps; \n\n while (cin) {\n getline(cin, input_line);\n if (!cin.eof())\n line_count++;\n };\n\n sec = (int) time(NULL) - start;\n cerr << \"Read \" << line_count << \" lines in \" << sec << \" seconds.\" ;\n if (sec > 0) {\n lps = line_count / sec;\n cerr << \" LPS: \" << lps << endl;\n } else\n cerr << endl;\n return 0;\n}\n\n//Compiled with:\n//g++ -O3 -o readline_test_cpp foo.cpp\n\nPython Equivalent:\n#!/usr/bin/env python\nimport time\nimport sys\n\ncount = 0\nstart = time.time()\n\nfor line in sys.stdin:\n count += 1\n\ndelta_sec = int(time.time() - start)\nif delta_sec > 0:\n lines_per_sec = int(round(count/delta_sec))\n print(\"Read {0} lines in {1} seconds.
LPS: {2}\".format(count, delta_sec,\n lines_per_sec))\n\nHere are my results:\n$ cat test_lines | ./readline_test_cpp \nRead 5570000 lines in 9 seconds. LPS: 618889\n\n$ cat test_lines | ./readline_test.py \nRead 5570000 lines in 1 seconds. LPS: 5570000\n\nEdit: I should note that I tried this both under OS X (10.6.8) and Linux 2.6.32 (RHEL 6.2). The former is a MacBook Pro, the latter is a very beefy server, not that this is too pertinent.\nEdit 2: (Removed this edit, as no longer applicable)\n$ for i in {1..5}; do echo \"Test run $i at `date`\"; echo -n \"CPP:\"; cat test_lines | ./readline_test_cpp ; echo -n \"Python:\"; cat test_lines | ./readline_test.py ; done\nTest run 1 at Mon Feb 20 21:29:28 EST 2012\nCPP: Read 5570001 lines in 9 seconds. LPS: 618889\nPython:Read 5570000 lines in 1 seconds. LPS: 5570000\nTest run 2 at Mon Feb 20 21:29:39 EST 2012\nCPP: Read 5570001 lines in 9 seconds. LPS: 618889\nPython:Read 5570000 lines in 1 seconds. LPS: 5570000\nTest run 3 at Mon Feb 20 21:29:50 EST 2012\nCPP: Read 5570001 lines in 9 seconds. LPS: 618889\nPython:Read 5570000 lines in 1 seconds. LPS: 5570000\nTest run 4 at Mon Feb 20 21:30:01 EST 2012\nCPP: Read 5570001 lines in 9 seconds. LPS: 618889\nPython:Read 5570000 lines in 1 seconds. LPS: 5570000\nTest run 5 at Mon Feb 20 21:30:11 EST 2012\nCPP: Read 5570001 lines in 10 seconds. LPS: 557000\nPython:Read 5570000 lines in 1 seconds. LPS: 5570000\n\nEdit 3: \nOkay, I tried J.N.'s suggestion of having Python store the line read, but it made no difference to Python's speed. \nI also tried J.N.'s suggestion of using scanf into a char array instead of getline into a std::string. Bingo! This resulted in equivalent performance for both Python and C++.
(3,333,333 LPS with my input data, which by the way are just short lines of three fields each, usually about 20 chars wide, though sometimes more).\nCode:\nchar input_a[512];\nchar input_b[32];\nchar input_c[512];\nwhile(scanf(\"%s %s %s\\n\", input_a, input_b, input_c) != EOF) { \n line_count++;\n};\n\nSpeed:\n$ cat test_lines | ./readline_test_cpp2 \nRead 10000000 lines in 3 seconds. LPS: 3333333\n$ cat test_lines | ./readline_test2.py \nRead 10000000 lines in 3 seconds. LPS: 3333333\n\n(Yes, I ran it several times.) So, I guess I will now use scanf instead of getline. But, I'm still curious if people think this performance hit from std::string/getline is typical and reasonable. \nEdit 4 (was: Final Edit / Solution):\nAdding:\n cin.sync_with_stdio(false);\nimmediately above my original while loop results in code that runs faster than Python. \nNew performance comparison (this is on my 2011 MacBook Pro), using the original code, the original with the sync disabled, and the original python, respectively, on a file with 20M lines of text. Yes, I ran it several times to eliminate the disk caching confound.\n$ /usr/bin/time cat test_lines_double | ./readline_test_cpp\n 33.30 real 0.04 user 0.74 sys\nRead 20000001 lines in 33 seconds. LPS: 606060\n$ /usr/bin/time cat test_lines_double | ./readline_test_cpp1b\n 3.79 real 0.01 user 0.50 sys\nRead 20000000 lines in 4 seconds. LPS: 5000000\n$ /usr/bin/time cat test_lines_double | ./readline_test.py \n 6.88 real 0.01 user 0.38 sys\nRead 20000000 lines in 6 seconds. LPS: 3333333\n\nThanks to @Vaughn Cato for his answer! Any elaboration people can make or good references people can point to as to why this sync happens, what it means, when it's useful, and when it's okay to disable would be greatly appreciated by posterity. :-)\nEdit 5 / Better Solution:\nAs suggested by Gandalf The Gray below, gets is even faster than scanf or the unsynchronized cin approach.
I also learned that scanf and gets are both UNSAFE and should NOT BE USED due to potential of buffer overflow. So, I wrote this iteration using fgets, the safer alternative to gets. Here are the pertinent lines for my fellow noobs:\nchar input_line[MAX_LINE];\nchar *result;\n\n//\n\nwhile((result = fgets(input_line, MAX_LINE, stdin )) != NULL) \n line_count++;\nif (ferror(stdin))\n perror(\"Error reading stdin.\");\n\nNow, here are the results using an even larger file (100M lines; ~3.4GB) on a fast server with very fast disk, comparing the python, the unsynced cin, and the fgets approaches, as well as comparing with the wc utility. [The scanf version segfaulted and I don't feel like troubleshooting it.]:\n$ /usr/bin/time cat temp_big_file | readline_test.py \n0.03user 2.04system 0:28.06elapsed 7%CPU (0avgtext+0avgdata 2464maxresident)k\n0inputs+0outputs (0major+182minor)pagefaults 0swaps\nRead 100000000 lines in 28 seconds. LPS: 3571428\n\n$ /usr/bin/time cat temp_big_file | readline_test_unsync_cin \n0.03user 1.64system 0:08.10elapsed 20%CPU (0avgtext+0avgdata 2464maxresident)k\n0inputs+0outputs (0major+182minor)pagefaults 0swaps\nRead 100000000 lines in 8 seconds. LPS: 12500000\n\n$ /usr/bin/time cat temp_big_file | readline_test_fgets \n0.00user 0.93system 0:07.01elapsed 13%CPU (0avgtext+0avgdata 2448maxresident)k\n0inputs+0outputs (0major+181minor)pagefaults 0swaps\nRead 100000000 lines in 7 seconds. LPS: 14285714\n\n$ /usr/bin/time cat temp_big_file | wc -l\n0.01user 1.34system 0:01.83elapsed 74%CPU (0avgtext+0avgdata 2464maxresident)k\n0inputs+0outputs (0major+182minor)pagefaults 0swaps\n100000000\n\n\nRecap (lines per second):\npython: 3,571,428 \ncin (no sync): 12,500,000\nfgets: 14,285,714\nwc: 54,644,808\n\nAs you can see, fgets is better but still pretty far from wc performance; I'm pretty sure this is due to the fact that wc examines each character without any memory copying. 
I suspect that, at this point, other parts of the code will become the bottleneck, so I don't think optimizing to that level would even be worthwhile, even if possible (since, after all, I actually need to store the read lines in memory). \nAlso note that a small tradeoff with using a char * buffer and fgets vs unsynced cin to string is that the latter can read lines of any length, while the former requires limiting input to some finite number. In practice, this is probably a non-issue for reading most line-based input files, as the buffer can be set to a very large value that would not be exceeded by valid input. \nThis has been educational. Thanks to all for your comments and suggestions.\nEdit 6:\nAs suggested by J.F. Sebastian in the comments below, the GNU wc utility uses plain C read() (within the safe-read.c wrapper) to read chunks (of 16k bytes) at a time and count newlines. Here's a python equivalent based on J.F.'s code (just showing the relevant snippet that replaces the python for loop):\nBUFFER_SIZE = 16384 \ncount = sum(chunk.count('\\n') for chunk in iter(partial(sys.stdin.read, BUFFER_SIZE), ''))\n\nThe performance of this version is quite fast (though still a bit slower than the raw c wc utility, of course):\n$ /usr/bin/time cat temp_big_file | readline_test3.py \n0.01user 1.16system 0:04.74elapsed 24%CPU (0avgtext+0avgdata 2448maxresident)k\n0inputs+0outputs (0major+181minor)pagefaults 0swaps\nRead 100000000 lines in 4.7275 seconds. LPS: 21152829\n\nAgain, it's a bit silly for me to compare C++ fgets/cin and the first python code on the one hand to wc -l and this last python snippet on the other, as the latter two don't actually store the read lines but merely count newlines. Still, it's interesting to explore all the different implementations and think about the performance implications.
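For anyone who wants to try the chunk-counting snippet outside a shell pipeline, it expands into a small self-contained function. This is just a sketch: the io.StringIO sample data is invented here so the function can be exercised without a real stdin.

```python
import io
from functools import partial


def count_lines(stream, buffer_size=16384):
    # Read fixed-size chunks until read() returns '' at EOF,
    # counting the newline characters in each chunk.
    return sum(chunk.count('\n')
               for chunk in iter(partial(stream.read, buffer_size), ''))


sample = io.StringIO("field1 field2 field3\n" * 1000)
print(count_lines(sample))  # 1000
```

Passing sys.stdin instead of the StringIO object gives the behavior of the snippet above.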
Thanks again!\nEdit 7: Tiny benchmark addendum and recap\nFor completeness, I thought I'd update the read speed for the same file on the same box with the original (synced) C++ code. Again, this is for a 100M line file on a fast disk. Here's the complete table now:\nImplementation Lines per second\npython (default) 3,571,428\ncin (default/naive) 819,672\ncin (no sync) 12,500,000\nfgets 14,285,714\nwc (not fair comparison) 54,644,808\n\n", "output": "By default, cin is synchronized with stdio, which causes it to avoid any input buffering. If you add this to the top of your main, you should see much better performance:\nstd::ios_base::sync_with_stdio(false);\n\nNormally, when an input stream is buffered, instead of reading one character at a time, the stream will be read in larger chunks. This reduces the number of system calls, which are typically relatively expensive. However, since the FILE* based stdio and iostreams often have separate implementations and therefore separate buffers, this could lead to a problem if both were used together. For example:\nint myvalue1;\ncin >> myvalue1;\nint myvalue2;\nscanf(\"%d\",&myvalue2);\n\nIf more input was read by cin than it actually needed, then the second integer value wouldn't be available for the scanf function, which has its own independent buffer. This would lead to unexpected results.\nTo avoid this, by default, streams are synchronized with stdio. One common way to achieve this is to have cin read each character one at a time as needed using stdio functions. Unfortunately, this introduces a lot of overhead. 
For small amounts of input, this isn't a big problem, but when you are reading millions of lines, the performance penalty is significant.\nFortunately, the library designers decided that you should also be able to disable this feature to get improved performance if you knew what you were doing, so they provided the sync_with_stdio method.\n"} {"input": "How to read a file line by line into a list with Python How do I read every line of a file in Python and store each line as an element in list? \nI want to read the file line by line and each line is appended to the end of the list.\n", "output": "with open(fname) as f:\n content = f.readlines()\n\nI'm guessing that you meant list and not array.\n"} {"input": "Installing pip on Mac OS X I spent most of the day yesterday searching for a clear answer for installing pip. I can't find a good solution. \nCan somebody help me install it?\n", "output": "All you need to do is\nsudo easy_install pip\n\n"} {"input": "How do I check if a string is a number (float) in Python? What is the best possible way to check if a string can be represented as a number in Python? \nThe function I currently have right now is:\ndef is_number(s):\n try:\n float(s)\n return True\n except ValueError:\n return False\n\nWhich, not only is ugly and slow, seems clunky. However I haven't found a better method because calling float in the main function is even worse. 
\n", "output": "In case you are looking for parsing (positive, unsigned) integers instead of floats, you can use the isdigit() function for string objects.\n>>> a = \"03523\"\n>>> a.isdigit()\nTrue\n>>> b = \"963spam\"\n>>> b.isdigit()\nFalse\n\nString Methods - isdigit()\nThere's also something on Unicode strings, which I'm not too familiar with\nUnicode - Is decimal/decimal\n"} {"input": "Upgrading all packages with pip Is it possible to upgrade all Python packages at one time with pip?\nNote that there is a feature request for this on the official issue tracker.\n", "output": "There isn't a built-in flag yet, but you can use\npip freeze --local | grep -v '^\\-e' | cut -d = -f 1 | xargs -n1 pip install -U\n\nNote: there are infinite potential variations for this. I'm trying to keep this answer short and simple, but please do suggest variations in the comments!\nRelevant edits:\n\nAdded a grep to skip \"-e\" package definitions, as suggested by @jawache (Yes, you could replace grep+cut with sed or awk or Perl or...).\nNewer versions of pip allow you to list outdated packages, though not in \"requirements\" format: pip list --outdated\nAdded -n1 to xargs, prevents stopping everything if updating one package fails (thanks @andsens)\n\n"} {"input": "How do you read from stdin in Python? I'm trying to do some of the code golf challenges, but they all require the input to be taken from stdin. How do I get that in Python?\n", "output": "You could use the fileinput module:\nimport fileinput\n\nfor line in fileinput.input():\n pass\n\nfileinput will loop through all the lines in the input specified as file names given in command-line arguments, or the standard input if no arguments are provided.\n"} {"input": "Python string formatting: % vs. .format Python 2.6 introduced the str.format() method with a slightly different syntax from the existing % operator. 
Which is better and for what situations?\n\nThe following uses each method and has the same outcome, so what is the difference?\n#!/usr/bin/python\nsub1 = \"python string!\"\nsub2 = \"an arg\"\n\na = \"i am a %s\" % sub1\nb = \"i am a {0}\".format(sub1)\n\nc = \"with %(kwarg)s!\" % {'kwarg':sub2}\nd = \"with {kwarg}!\".format(kwarg=sub2)\n\nprint a # \"i am a python string!\"\nprint b # \"i am a python string!\"\nprint c # \"with an arg!\"\nprint d # \"with an arg!\"\n\nFurthermore, when does string formatting occur in Python? For example, if my logging level is set to HIGH will I still take a hit for performing the following % operation? And if so, is there a way to avoid this?\nlog.debug(\"some debug info: %s\" % some_info)\n\n\n", "output": "To answer your first question... .format just seems more sophisticated in many ways. An annoying thing about % is also how it can either take a variable or a tuple. You'd think the following would always work:\n\"hi there %s\" % name\n\nyet, if name happens to be (1, 2, 3), it will throw a TypeError. To guarantee that it always prints, you'd need to do\n\"hi there %s\" % (name,) # supply the single argument as a single-item tuple\n\nwhich is just ugly. .format doesn't have those issues. Also in the second example you gave, the .format example is much cleaner looking.\nWhy would you not use it? \n\nnot knowing about it (me before reading this)\nhaving to be compatible with Python 2.5\n\n\nTo answer your second question, string formatting happens at the same time as any other operation - when the string formatting expression is evaluated. And Python, not being a lazy language, evaluates expressions before calling functions, so in your log.debug example, the expression \"some debug info: %s\" % some_info will first evaluate to, e.g. \"some debug info: roflcopters are active\", then that string will be passed to log.debug().
\n"} {"input": "Determine the type of a Python object Is there a simple way to determine if a variable is a list, dictionary, or something else? I am getting an object back that may be either type and I need to be able to tell the difference.\n", "output": "To get the type of an object, you can use the built-in type() function. Passing an object as the only parameter will return the type object of that object:\n>>> type([]) is list\nTrue\n>>> type({}) is dict\nTrue\n>>> type('') is str\nTrue\n>>> type(0) is int\nTrue\n>>> type({})\n<type 'dict'>\n>>> type([])\n<type 'list'>\n\nThis of course also works for custom types:\n>>> class Test1 (object):\n pass\n>>> class Test2 (Test1):\n pass\n>>> a = Test1()\n>>> b = Test2()\n>>> type(a) is Test1\nTrue\n>>> type(b) is Test2\nTrue\n\nNote that type() will only return the immediate type of the object, but won't be able to tell you about type inheritance.\n>>> type(b) is Test1\nFalse\n\nTo cover that, you should use the isinstance function. This of course also works for built-in types:\n>>> isinstance(b, Test1)\nTrue\n>>> isinstance(b, Test2)\nTrue\n>>> isinstance(a, Test1)\nTrue\n>>> isinstance(a, Test2)\nFalse\n>>> isinstance([], list)\nTrue\n>>> isinstance({}, dict)\nTrue\n\nisinstance() is usually the preferred way to ensure the type of an object because it will also accept derived types. So unless you actually need the type object (for whatever reason), using isinstance() is preferred over type().\nThe second parameter of isinstance() also accepts a tuple of types, so it's possible to check for multiple types at once. isinstance will then return True if the object is of any of those types:\n>>> isinstance([], (tuple, list, set))\nTrue\n\n"} {"input": "What does ** (double star) and * (star) do for Python parameters?
In the following method definitions, what does the * and ** do for param2?\ndef foo(param1, *param2):\ndef bar(param1, **param2):\n\n", "output": "The *args and **kwargs are a common idiom to allow an arbitrary number of arguments to functions, as described in the section more on defining functions in the Python documentation.\nThe *args will give you all function parameters as a tuple:\nIn [1]: def foo(*args):\n ...: for a in args:\n ...: print a\n ...: \n ...: \n\nIn [2]: foo(1)\n1\n\n\nIn [4]: foo(1,2,3)\n1\n2\n3\n\nThe **kwargs will give you all keyword arguments except for those corresponding to a formal parameter as a dictionary.\nIn [5]: def bar(**kwargs):\n ...: for a in kwargs:\n ...: print a, kwargs[a]\n ...: \n ...: \n\nIn [6]: bar(name='one', age=27)\nage 27\nname one\n\nBoth idioms can be mixed with normal arguments to allow a set of fixed and some variable arguments:\ndef foo(kind, *args, **kwargs):\n pass\n\nAnother usage of the *l idiom is to unpack argument lists when calling a function.\nIn [9]: def foo(bar, lee):\n ...: print bar, lee\n ...: \n ...: \n\nIn [10]: l = [1,2]\n\nIn [11]: foo(*l)\n1 2\n\nIn Python 3 it is possible to use *l on the left side of an assignment (Extended Iterable Unpacking):\nfirst, *rest = [1,2,3,4]\nfirst, *l, last = [1,2,3,4]\n\nAlso, Python 3 adds new semantics (see PEP 3102):\ndef func(arg1, arg2, arg3='default', *, kwarg1='abc', kwarg2='xyz'):\n pass\n\nSuch a function accepts only 2 positional arguments, and everything after * can only be passed as a keyword argument, not a positional one.\nIn Python 2, something similar was true for all parameters after *args.\n"} {"input": "How to convert string to lowercase in Python? Is there any way to convert an entire user inputted string from uppercase, or even part uppercase to lowercase? \nE.g. Kilometers --> kilometers.\n", "output": "s = \"Kilometer\"\nprint(s.lower())\n\nOfficial documentation here\n"} {"input": "Converting string into datetime Short and simple.
I've got a huge list of date-times like this as strings:\nJun 1 2005 1:33PM\nAug 28 1999 12:00AM\n\nI'm going to be shoving these back into proper datetime fields in a database so I need to magic them into real datetime objects. \nAny help (even if it's just a kick in the right direction) would be appreciated.\nEdit: This is going through Django's ORM so I can't use SQL to do the conversion on insert.\n", "output": "from datetime import datetime\n\ndate_object = datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')\n\nLink to the Python documentation for strptime\nand a link for the strftime format mask\nThis is also a really nice reference for strftime\n"} {"input": "Getting the last element of a list in Python In Python, how do you get the last element of a list?\n", "output": "some_list[-1] is the shortest and most Pythonic.\nIn fact, you can do much more with this syntax. The some_list[-n] syntax gets the nth-to-last element. So some_list[-1] gets the last element, some_list[-2] gets the second to last, etc, all the way down to some_list[-len(some_list)], which gives you the first element.\nYou can also set list elements in this way. For instance:\n>>> some_list = [1, 2, 3]\n>>> some_list[-1] = 5 # Set the last element\n>>> some_list[-2] = 3 # Set the second to last element\n>>> some_list\n[1, 3, 5]\n\n"} {"input": "How do I randomly select an item from a list using Python? Assume I have the following list:\nfoo = ['a', 'b', 'c', 'd', 'e']\n\nWhat is the simplest way to retrieve an item at random from this list?\n", "output": "Use random.choice:\nimport random\n\nfoo = ['a', 'b', 'c', 'd', 'e']\nprint(random.choice(foo))\n\n"} {"input": "How do I sort a list of dictionaries by values of the dictionary in Python?
I got a list of dictionaries and want that to be sorted by a value of that dictionary.\nThis\n[{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]\n\nsorted by name, should become\n[{'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}]\n\n", "output": "It may look cleaner using a key instead of a cmp:\nnewlist = sorted(list_to_be_sorted, key=lambda k: k['name']) \n\nor as J.F.Sebastian and others suggested,\nfrom operator import itemgetter\nnewlist = sorted(list_to_be_sorted, key=itemgetter('name')) \n\nFor completeness (as pointed out in comments by fitzgeraldsteele), add reverse=True to sort descending:\nnewlist = sorted(l, key=itemgetter('name'), reverse=True)\n\n"} {"input": "How do I copy a file in python? How do I copy a file in Python? I couldn't find anything under os.\n", "output": "shutil has many methods you can use. One of which is:\nfrom shutil import copyfile\n\ncopyfile(src, dst)\n\nCopy the contents of the file named src to a file named dst. The destination location must be writable; otherwise, an IOError exception will be raised. If dst already exists, it will be replaced. Special files such as character or block devices and pipes cannot be copied with this function. src and dst are path names given as strings. \n"} {"input": "Manually raising (throwing) an exception in Python How can I raise an exception in Python so that it can later be caught via an except block?\n", "output": "\nHow do I manually throw/raise an exception in Python?\n\nUse the most specific Exception constructor that semantically fits your issue.
\nBe specific in your message, e.g.:\nraise ValueError('A very specific bad thing happened')\n\nDon't do this:\nAvoid raising a generic Exception; to catch it, you'll have to catch all other more specific exceptions that subclass it.\nHiding bugs\nraise Exception('I know Python!') # don't, if you catch, likely to hide bugs.\n\nFor example:\ndef demo_bad_catch():\n try:\n raise ValueError('represents a hidden bug, do not catch this')\n raise Exception('This is the exception you expect to handle')\n except Exception as error:\n print('caught this error: ' + repr(error))\n\n>>> demo_bad_catch()\ncaught this error: ValueError('represents a hidden bug, do not catch this',)\n\nWon't catch\nand more specific catches won't catch the general exception:\ndef demo_no_catch():\n try:\n raise Exception('general exceptions not caught by specific handling')\n except ValueError as e:\n print('we will not catch e')\n\n\n>>> demo_no_catch()\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\n File \"<stdin>\", line 3, in demo_no_catch\nException: general exceptions not caught by specific handling\n\nBest Practice:\nInstead, use the most specific Exception constructor that semantically fits your issue.\nraise ValueError('A very specific bad thing happened')\n\nwhich also handily allows an arbitrary number of arguments to be passed to the constructor. This works in Python 2 and 3.\nraise ValueError('A very specific bad thing happened', 'foo', 'bar', 'baz') \n\nThese arguments are accessed by the args attribute on the Exception object.
For example:\ntry:\n some_code_that_may_raise_our_value_error()\nexcept ValueError as err:\n print(err.args)\n\nprints \n('message', 'foo', 'bar', 'baz') \n\nIn Python 2.5, an actual message attribute was added to BaseException in favor of encouraging users to subclass Exceptions and stop using args, but the introduction of message and the original deprecation of args has been retracted.\nWhen in except clause\nWhen inside an except clause, you might want to, e.g. log that a specific type of error happened, and then reraise. The best way to do this while preserving the stack trace is to use a bare raise statement, e.g.:\ntry:\n do_something_in_app_that_breaks_easily()\nexcept AppError as error:\n logger.error(error)\n raise # just this!\n # raise AppError # Don't do this, you'll lose the stack trace!\n\nYou can preserve the stacktrace (and error value) with sys.exc_info(), but this is way more error prone, prefer to use a bare raise to reraise. This is the syntax in Python 2:\n raise AppError, error, sys.exc_info()[2] # avoid this.\n # Equivalently, as error *is* the second object:\n raise sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2]\n\nIn Python 3:\n raise error.with_traceback(sys.exc_info()[2])\n\nAgain: avoid manually manipulating tracebacks. It's less efficient and more error prone. And if you're using threading and sys.exc_info you may even get the wrong traceback (especially if you're using exception handling for control flow - which I'd personally tend to avoid.)\nPython 3, Exception chaining\nIn Python 3, you can chain Exceptions, which preserve tracebacks:\n raise RuntimeError('specific message') from error\n\nBut beware, this does change the error type raised.\nDeprecated Methods:\nThese can easily hide and even get into production code. 
You want to raise an exception/error, and doing them will raise an error, but not the one intended!\nValid in Python 2, but not in Python 3 is the following:\nraise ValueError, 'message' # Don't do this, it's deprecated!\n\nOnly valid in much older versions of Python (2.4 and lower), you may still see people raising strings:\nraise 'message' # really really wrong. don't do this.\n\nIn all modern versions, this will actually raise a TypeError, because you're not raising a BaseException type. If you're not checking for the right exception and don't have a reviewer that's aware of the issue, it could get into production.\nExample Usage:\nI raise Exceptions to warn consumers of my API if they're using it incorrectly:\ndef api_func(foo):\n '''foo should be either 'baz' or 'bar'. returns something very useful.'''\n if foo not in _ALLOWED_ARGS:\n raise ValueError('{foo} wrong, use \"baz\" or \"bar\"'.format(foo=repr(foo)))\n\nCreate your own error types when apropos:\n\n\"I want to make an error on purpose, so that it would go into the except\"\n\nYou can create your own error types, if you want to indicate something specific is wrong with your application, just subclass the appropriate point in the exception hierarchy:\nclass MyAppLookupError(LookupError):\n '''raise this when there's a lookup error for my app'''\n\nand usage:\nif important_key not in resource_dict and not ok_to_be_missing:\n raise MyAppLookupError('resource is missing, and that is not ok.')\n\n"} {"input": "Replacements for switch statement in Python? I want to write a function in Python that returns different fixed values based on the value of an input index. \nIn other languages I would use a switch or case statement, but Python does not appear to have a switch statement. 
What are the recommended Python solutions in this scenario?\n", "output": "If you'd like defaults you could use the dictionary get(key[, default]) method:\ndef f(x):\n return {\n 'a': 1,\n 'b': 2,\n }.get(x, 9) # 9 is default if x not found\n\n"} {"input": "How to print in Python without newline or space? The question is in the title.\nI'd like to do in python what I do in this example in c:\n#include \n\nint main() {\n int i;\n for (i=0; i<10; i++) printf(\".\");\n return 0;\n}\n\nOutput:\n..........\n\nIn Python:\n>>> for i in xrange(0,10): print '.'\n.\n.\n.\n.\n.\n.\n.\n.\n.\n.\n>>> for i in xrange(0,10): print '.',\n. . . . . . . . . .\n\nIn Python print will add a \\n or a space, how can I avoid that? Now, it's just an example. Don't tell me I can first make a string then print it. I'd like to know how to \"append\" strings to the stdout (I don't know if it's worded correctly).\n", "output": "General way\nimport sys\nsys.stdout.write('.')\n\nYou may also need to call\nsys.stdout.flush()\n\nto ensure stdout is flushed immediately.\nPython 2.6+\nFrom Python 2.6 you can import the print function from Python 3:\nfrom __future__ import print_function\n\nThis allows you to use the Python 3 solution below.\nPython 3\nIn Python 3, the print statement has been changed into a function. In Python 3, you can instead do:\nprint('.', end=\"\")\n\nThis also works in Python 2, provided that you've used from __future__ import print_function.\nIf you are having trouble with buffering, you can flush the output by adding flush=True keyword argument:\nprint('.', end=\"\", flush=True)\n\n"} {"input": "Why is \"1000000000000000 in range(1000000000000001)\" so fast in Python 3? It is my understanding that the range() function, which is actually an object type in Python 3, generates its contents on the fly, similar to a generator. 
\nThis being the case, I would have expected the following line to take an inordinate amount of time, because in order to determine whether 1 quadrillion is in the range, a quadrillion values would have to be generated: \n1000000000000000 in range(1000000000000001)\n\nFurthermore: it seems that no matter how many zeroes I add on, the calculation more or less takes the same amount of time (basically instantaneous). \nI have also tried things like this, but the calculation is still almost instant: \n1000000000000000000000 in range(0,1000000000000000000001,10) # count by tens\n\nIf I try to implement my own range function, the result is not so nice!! \ndef my_crappy_range(N):\n i = 0\n while i < N:\n yield i\n i += 1\n return\n\nWhat is the range() object doing under the hood that makes it so fast? \n\nEDIT: This has turned out to be a much more nuanced topic than I anticipated - there seems to be a bit of history behind the optimization of range(). \nMartijn Pieters' answer was chosen for its completeness, but also see abarnert's first answer for a good discussion of what it means for range to be a full-fledged sequence in Python 3, and some information/warning regarding potential inconsistency for __contains__ function optimization across Python implementations. abarnert's other answer goes into some more detail and provides links for those interested in the history behind the optimization in Python 3 (and lack of optimization of xrange in Python 2). Answers by poke and by wim provide the relevant C source code and explanations for those who are interested. \n", "output": "The Python 3 range() object doesn't produce numbers immediately; it is a smart sequence object that produces numbers on demand. All it contains is your start, stop and step values, then as you iterate over the object the next integer is calculated each iteration.\nThe object also implements the object.__contains__ hook, and calculates if your number is part of its range. 
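As a quick illustration (this snippet is mine, not part of the original answer), even an astronomically large stepped range answers a membership test immediately, because the check reduces to a bounds comparison plus one modulo operation:

```python
r = range(0, 10**15 + 1, 10)  # a hundred-trillion-element range, counting by tens

# Both answers come back instantly: the __contains__ hook only needs a
# bounds comparison plus (value - start) % step == 0, with no iteration.
print(10**15 in r)       # True: in bounds and on the step grid
print(10**15 - 1 in r)   # False: not a multiple of the step
```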
Calculating is a O(1) constant time operation. There is never a need to scan through all possible integers in the range.\nFrom the range() object documentation:\n\nThe advantage of the range type over a regular list or tuple is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed).\n\nSo at a minimum, your range() object would do:\nclass my_range(object):\n def __init__(self, start, stop=None, step=1):\n if stop is None:\n start, stop = 0, start\n self.start, self.stop, self.step = start, stop, step\n if step < 0:\n lo, hi = stop, start\n else:\n lo, hi = start, stop\n self.length = ((hi - lo - 1) // abs(step)) + 1\n\n def __iter__(self):\n current = self.start\n if self.step < 0:\n while current > self.stop:\n yield current\n current += self.step\n else:\n while current < self.stop:\n yield current\n current += self.step\n\n def __len__(self):\n return self.length\n\n def __getitem__(self, i):\n if 0 <= i < self.length:\n return self.start + i * self.step\n raise IndexError('Index out of range: {}'.format(i))\n\n def __contains__(self, num):\n if self.step < 0:\n if not (self.stop < num <= self.start):\n return False\n else:\n if not (self.start <= num < self.stop):\n return False\n return (num - self.start) % self.step == 0\n\nThis is still missing several things that a real range() supports (such as the .index() or .count() methods, hashing, equality testing, or slicing), but should give you an idea.\nI also simplified the __contains__ implementation to only focus on integer tests; if you give a real range() object a non-integer value (including subclasses of int), a slow scan is initiated to see if there is a match, just as if you use a containment test against a list of all the contained values. 
This was done to continue to support other numeric types that just happen to support equality testing with integers but are not expected to support integer arithmetic as well. See the original Python issue that implemented the containment test.\n"} {"input": "Calling a function of a module from a string with the function's name in Python What is the best way to go about calling a function given a string with the function's name in a Python program. For example, let's say that I have a module foo, and I have a string whose contents are \"bar\". What is the best way to go about calling foo.bar()?\nI need to get the return value of the function, which is why I don't just use eval. I figured out how to do it by using eval to define a temp function that returns the result of that function call, but I'm hoping that there is a more elegant way to do this.\n", "output": "Assuming module foo with method bar:\nimport foo\nmethodToCall = getattr(foo, 'bar')\nresult = methodToCall()\n\nAs far as that goes, lines 2 and 3 can be compressed to:\nresult = getattr(foo, 'bar')()\n\nif that makes more sense for your use case. You can use getattr in this fashion on class instance bound methods, module-level methods, class methods... 
the list goes on.\n"} {"input": "Find current directory and file's directory In Python, what commands can I use to find:\n\nthe current directory (where I was in the terminal when I ran the Python script), and\nwhere the file I am executing is?\n\n", "output": "To get the full path to the directory a Python file is contained in, write this in that file:\nimport os \ndir_path = os.path.dirname(os.path.realpath(__file__))\n\n(Note that the incantation above won't work if you've already used os.chdir() to change your current working directory, since the value of the __file__ constant is relative to the current working directory and is not changed by an os.chdir() call.)\n\nTo get the current working directory use \nimport os\ncwd = os.getcwd()\n\n\nDocumentation references for the modules, constants and functions used above:\n\nThe os and os.path modules.\nThe __file__ constant\nos.path.realpath(path) (returns \"the canonical path of the specified filename, eliminating any symbolic links encountered in the path\")\nos.path.dirname(path) (returns \"the directory name of pathname path\")\nos.getcwd() (returns \"a string representing the current working directory\")\nos.chdir(path) (\"change the current working directory to path\")\n\n"} {"input": "Reverse a string in Python There is no built-in reverse function in Python's str object. What is the best way of implementing this?\nIf supplying a very concise answer, please elaborate on its efficiency. Is the str converted to a different object, etc. \n", "output": "How about:\n>>> 'hello world'[::-1]\n'dlrow olleh'\n\nThis is extended slice syntax. It works by doing [begin:end:step] - by leaving begin and end off and specifying a step of -1, it reverses a string.\n"} {"input": "What is __init__.py for? What is __init__.py for in a Python source directory?\n", "output": "It's a part of a package. 
Here's the documentation.\n\nThe __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, from unintentionally hiding valid modules that occur later (deeper) on the module search path. In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable, described later.\n\n"} {"input": "How to know if an object has an attribute in Python Is there a way in Python to determine if an object has some attribute? For example:\n>>> a = SomeClass()\n>>> a.someProperty = value\n>>> a.property\nTraceback (most recent call last):\n File \"\", line 1, in \nAttributeError: SomeClass instance has no attribute 'property'\n\nHow can you tell if a has the attribute property before using it?\n", "output": "Try hasattr():\nif hasattr(a, 'property'):\n a.property\n\nEDIT: See zweiterlinde's answer below, who offers good advice about asking forgiveness! A very pythonic approach! \nThe general practice in python is that, if the property is likely to be there most of the time, simply call it and either let the exception propagate, or trap it with a try/except block. This will likely be faster than hasattr. If the property is likely to not be there most of the time, or you're not sure, using hasattr will probably be faster than repeatedly falling into an exception block.\n"} {"input": "How do you append to a file in Python? How do you append to the file instead of overwriting it? Is there a special function that appends to the file?\n", "output": "with open(\"test.txt\", \"a\") as myfile:\n myfile.write(\"appended text\")\n\n"} {"input": "Python @classmethod and @staticmethod for beginner? Could someone explain to me the meaning of @classmethod and @staticmethod in python? I need to know the difference and the meaning. 
\nAs far as I understand, @classmethod tells a class that it's a method which should be inherited into subclasses, or... something. However, what's the point of that? Why not just define the class method without adding @classmethod or @staticmethod or any @ definitions?\ntl;dr: when should I use them, why should I use them, and how should I use them?\nI'm pretty advanced with C++, so using more advanced programming concepts shouldn't be a problem. Feel free giving me a corresponding C++ example if possible.\n", "output": "Though classmethod and staticmethod are quite similar, there's a slight difference in usage for both entities: classmethod must have a reference to a class object as the first parameter, whereas staticmethod can have no parameters at all.\nLet's look at all that was said in real examples.\nBoilerplate\nLet's assume an example of a class, dealing with date information (this is what will be our boilerplate to cook on):\nclass Date(object):\n\n day = 0\n month = 0\n year = 0\n\n def __init__(self, day=0, month=0, year=0):\n self.day = day\n self.month = month\n self.year = year\n\nThis class obviously could be used to store information about certain dates (without timezone information; let's assume all dates are presented in UTC).\nHere we have __init__, a typical initializer of Python class instances, which receives arguments as a typical instancemethod, having the first non-optional argument (self) that holds reference to a newly created instance.\nClass Method\nWe have some tasks that can be nicely done using classmethods.\nLet's assume that we want to create a lot of Date class instances having date information coming from outer source encoded as a string of next format ('dd-mm-yyyy'). 
We have to do that in different places of our source code in project.\nSo what we must do here is:\n\nParse a string to receive day, month and year as three integer variables or a 3-item tuple consisting of that variable.\nInstantiate Date by passing those values to initialization call.\n\nThis will look like:\nday, month, year = map(int, string_date.split('-'))\ndate1 = Date(day, month, year)\n\nFor this purpose, C++ has such feature as overloading, but Python lacks that feature- so here's when classmethod applies. Lets create another \"constructor\".\n @classmethod\n def from_string(cls, date_as_string):\n day, month, year = map(int, date_as_string.split('-'))\n date1 = cls(day, month, year)\n return date1\n\ndate2 = Date.from_string('11-09-2012')\n\nLet's look more carefully at the above implementation, and review what advantages we have here:\n\nWe've implemented date string parsing in one place and it's reusable now.\nEncapsulation works fine here (if you think that you could implement string parsing as a single function elsewhere, this solution fits OOP paradigm far better).\ncls is an object that holds class itself, not an instance of the class. It's pretty cool because if we inherit our Date class, all children will have from_string defined also.\n\nStatic method\nWhat about staticmethod? It's pretty similar to classmethod but doesn't take any obligatory parameters (like a class method or instance method does).\nLet's look at the next use case.\nWe have a date string that we want to validate somehow. This task is also logically bound to Date class we've used so far, but still doesn't require instantiation of it.\nHere is where staticmethod can be useful. 
Let's look at the next piece of code:\n @staticmethod\n def is_date_valid(date_as_string):\n day, month, year = map(int, date_as_string.split('-'))\n return day <= 31 and month <= 12 and year <= 3999\n\n # usage:\n is_date = Date.is_date_valid('11-09-2012')\n\nSo, as we can see from usage of staticmethod, we don't have any access to what the class is- it's basically just a function, called syntactically like a method, but without access to the object and it's internals (fields and another methods), while classmethod does.\n"} {"input": "How can you profile a Python script? Project Euler and other coding contests often have a maximum time to run or people boast of how fast their particular solution runs. With python, sometimes the approaches are somewhat kludgey - i.e., adding timing code to __main__.\nWhat is a good way to profile how long a python program takes to run?\n", "output": "Python includes a profiler called cProfile. It not only gives the total running time, but also times each function separately, and tells you how many times each function was called, making it easy to determine where you should make optimizations.\nYou can call it from within your code, or from the interpreter, like this:\nimport cProfile\ncProfile.run('foo()')\n\nEven more usefully, you can invoke the cProfile when running a script:\npython -m cProfile myscript.py\n\nTo make it even easier, I made a little batch file called 'profile.bat':\npython -m cProfile %1\n\nSo all I have to do is run:\nprofile euler048.py\n\nAnd I get this:\n1007 function calls in 0.061 CPU seconds\n\nOrdered by: standard name\nncalls tottime percall cumtime percall filename:lineno(function)\n 1 0.000 0.000 0.061 0.061 :1()\n 1000 0.051 0.000 0.051 0.000 euler048.py:2()\n 1 0.005 0.005 0.061 0.061 euler048.py:2()\n 1 0.000 0.000 0.061 0.061 {execfile}\n 1 0.002 0.002 0.053 0.053 {map}\n 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler objects}\n 1 0.000 0.000 0.000 0.000 {range}\n 1 0.003 0.003 
0.003 0.003 {sum}\n\nEDIT: Updated link to a good video resource from PyCon 2013: http://lanyrd.com/2013/pycon/scdywg/\n"} {"input": "Find all files in directory with extension .txt in Python How can I find all files in directory with the extension .txt in python?\n", "output": "You can use glob:\nimport glob, os\nos.chdir(\"/mydir\")\nfor file in glob.glob(\"*.txt\"):\n print(file)\n\nor simply os.listdir:\nimport os\nfor file in os.listdir(\"/mydir\"):\n if file.endswith(\".txt\"):\n print(file)\n\nor if you want to traverse directory, use os.walk:\nimport os\nfor root, dirs, files in os.walk(\"/mydir\"):\n for file in files:\n if file.endswith(\".txt\"):\n print(os.path.join(root, file))\n\n"} {"input": "Create a dictionary with list comprehension in Python I like the Python list comprehension syntax.\nCan it be used to create dictionaries too? For example, by iterating over pairs of keys and values:\nmydict = {(k,v) for (k,v) in blah blah blah} # doesn't work :(\n\n", "output": "In Python 2.6 and earlier, the dict constructor can receive an iterable of key/value pairs:\nd = dict((key, value) for (key, value) in iterable)\n\nFrom Python 2.7 and 3 onwards, you can just use the dict comprehension syntax directly:\nd = {key: value for (key, value) in iterable}\n\nOf course, you can use the iterable in any way you want (tuples and lists literals, generator comprehensions, list comprehensions, generator functions, functional composition... 
feel creative) as long as each element is an iterable itself of two elements:\nd = {value: foo(value) for value in sequence if bar(value)}\n\ndef key_value_gen(k):\n yield chr(k+65)\n yield chr((k+13)%26+65)\nd = dict(map(key_value_gen, range(26)))\n\n"} {"input": "error: Unable to find vcvarsall.bat I tried to install the Python package dulwich:\npip install dulwich\n\nBut I get a cryptic error message:\nerror: Unable to find vcvarsall.bat\n\nThe same happens if I try installing the package manually:\n> python setup.py install\nrunning build_ext\nbuilding 'dulwich._objects' extension\nerror: Unable to find vcvarsall.bat\n\n", "output": "Update: Comments point out that the instructions here may be dangerous. Consider using the Visual C++ 2008 Express edition or the purpose-built Microsoft Visual C++ Compiler for Python (details) and NOT using the original answer below. Original error message means the required version of Visual C++ is not installed.\n\nFor Windows installations:\nWhile running setup.py for package installations, Python 2.7 searches for an installed Visual Studio 2008. You can trick Python to use a newer Visual Studio by setting the correct path in VS90COMNTOOLS environment variable before calling setup.py.\nExecute the following command based on the version of Visual Studio installed:\n\nVisual Studio 2010 (VS10): SET VS90COMNTOOLS=%VS100COMNTOOLS%\nVisual Studio 2012 (VS11): SET VS90COMNTOOLS=%VS110COMNTOOLS%\nVisual Studio 2013 (VS12): SET VS90COMNTOOLS=%VS120COMNTOOLS%\nVisual Studio 2015 (VS14): SET VS90COMNTOOLS=%VS140COMNTOOLS%\n\n\nWARNING: As noted below, this answer is unlikely to work if you are trying to compile python modules.\nSee Building lxml for Python 2.7 on Windows for details.\n"} {"input": "How do I delete a file or folder in Python? 
How do I delete a file or folder in Python?\nI am using the Python scripting language running on the Windows XP operating system.\n", "output": "os.remove() will remove a file.\nos.rmdir() will remove an empty directory.\nshutil.rmtree() will delete a directory and all its contents.\n"} {"input": "Getting the class name of an instance in Python How do I find out a name of class that created an instance of an object in Python if the function I am doing this from is the base class of which the class of the instance has been derived?\nWas thinking maybe the inspect module might have helped me out here, but it doesn't seem to give me what I want. And short of parsing the __class__ member, I'm not sure how to get at this information.\n", "output": "Have you tried the __name__ attribute of the class? ie type(x).__name__ will give you the name of the class, which I think is what you want.\n>>> import itertools\n>>> x = itertools.count(0)\n>>> type(x).__name__\n'count'\n\nThis method works with new-style classes only. Your code might use some old-style classes. The following works for both:\nx.__class__.__name__\n\n"} {"input": "Random string generation with upper case letters and digits in Python I want to generate a string of size N.\nIt should be made up of numbers and uppercase English letters such as:\n\n6U1S75\n4Z4UKK\nU911K4\n\nHow can I achieve this in a pythonic way?\n", "output": "Answer in one line:\n''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))\n\nA more secure version; see http://stackoverflow.com/a/23728630/2213647:\n''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))\n\nIn details, with a clean function for further reuse:\n>>> import string\n>>> import random\n>>> def id_generator(size=6, chars=string.ascii_uppercase + string.digits):\n... 
return ''.join(random.choice(chars) for _ in range(size))\n...\n>>> id_generator()\n'G5G74W'\n>>> id_generator(3, \"6793YUIO\")\n'Y3U'\n\nHow does it work ?\nWe import string, a module that contains sequences of common ASCII characters, and random, a module that deals with random generation.\nstring.ascii_uppercase + string.digits just concatenates the list of characters representing uppercase ASCII chars and digits:\n>>> string.ascii_uppercase\n'ABCDEFGHIJKLMNOPQRSTUVWXYZ'\n>>> string.digits\n'0123456789'\n>>> string.ascii_uppercase + string.digits\n'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'\n\nThen we use a list comprehension to create a list of 'n' elements:\n>>> range(4) # range create a list of 'n' numbers\n[0, 1, 2, 3]\n>>> ['elem' for _ in range(4)] # we use range to create 4 times 'elem'\n['elem', 'elem', 'elem', 'elem']\n\nIn the example above, we use [ to create the list, but we don't in the id_generator function so Python doesn't create the list in memory, but generates the elements on the fly, one by one (more about this here).\nInstead of asking to create 'n' times the string elem, we will ask Python to create 'n' times a random character, picked from a sequence of characters:\n>>> random.choice(\"abcde\")\n'a'\n>>> random.choice(\"abcde\")\n'd'\n>>> random.choice(\"abcde\")\n'b'\n\nTherefore random.choice(chars) for _ in range(size) really is creating a sequence of size characters. Characters that are randomly picked from chars:\n>>> [random.choice('abcde') for _ in range(3)]\n['a', 'b', 'b']\n>>> [random.choice('abcde') for _ in range(3)]\n['e', 'b', 'e']\n>>> [random.choice('abcde') for _ in range(3)]\n['d', 'a', 'c']\n\nThen we just join them with an empty string so the sequence becomes a string:\n>>> ''.join(['a', 'b', 'b'])\n'abb'\n>>> [random.choice('abcde') for _ in range(3)]\n['d', 'c', 'b']\n>>> ''.join(random.choice('abcde') for _ in range(3))\n'dac'\n\n"} {"input": "How do I connect to a MySQL Database in Python? 
How do I connect to a MySQL database using a python program?\n", "output": "\nConnecting to MYSQL with Python in 3 steps\n1 - Setting\nYou must install a MySQL driver before doing anything. Unlike PHP, only the SQLite driver is installed by default with Python. The most used package to do so is MySQLdb but it's hard to install it using easy_install.\nFor Windows user, you can get an exe of MySQLdb. \nFor Linux, this is a casual package (python-mysqldb). (You can use sudo apt-get install python-mysqldb (for debian based distros), yum install mysql-python (for rpm-based), or dnf install python-mysql (for modern fedora distro) in command line to download.)\nFor Mac, you can install MySQLdb using Macport.\n2 - Usage\nAfter installing, reboot. This is not mandatory, but will prevent me from answering 3 or 4 other questions in this post if something goes wrong. So please reboot.\nThen it is just like using another package :\n#!/usr/bin/python\nimport MySQLdb\n\ndb = MySQLdb.connect(host=\"localhost\", # your host, usually localhost\n user=\"john\", # your username\n passwd=\"megajonhy\", # your password\n db=\"jonhydb\") # name of the data base\n\n# you must create a Cursor object. It will let\n# you execute all the queries you need\ncur = db.cursor()\n\n# Use all the SQL you like\ncur.execute(\"SELECT * FROM YOUR_TABLE_NAME\")\n\n# print all the first cell of all the rows\nfor row in cur.fetchall():\n print row[0]\n\ndb.close()\n\nOf course, there are thousand of possibilities and options; this is a very basic example. You will have to look at the documentation. A good starting point.\n3 - More advanced usage\nOnce you know how it works, you may want to use an ORM to avoid writting SQL manually and manipulate your tables as they were Python objects. The most famous ORM in the Python community is SQLAlchemy. \nI strongly advise you to use it: your life is going to be much easier.\nI recently discovered another jewel in the Python world: peewee. 
It's a very lite ORM, really easy and fast to setup then use. It makes my day for small projects or stand alone apps, where using big tools like SQLAlchemy or Django is overkill :\nimport peewee\nfrom peewee import *\n\ndb = MySQLDatabase('jonhydb', user='john',passwd='megajonhy')\n\nclass Book(peewee.Model):\n author = peewee.CharField()\n title = peewee.TextField()\n\n class Meta:\n database = db\n\nBook.create_table()\nbook = Book(author=\"me\", title='Peewee is cool')\nbook.save()\nfor book in Book.filter(author=\"me\"):\n print book.title\n\nPeewee is cool\n\nThis example works out of the box. Nothing other than having peewee (pip install peewee :-)) is required. No complicated setup. It's really cool.\n"} {"input": "How to print to stderr in Python? I've come across at least three ways to print to stderr:\n import sys\n\n print >> sys.stderr, 'spam'\n\n sys.stderr.write('spam\\n')\n\n from __future__ import print_function\n print('spam', file=sys.stderr)\n\nIt seems to contradict zen of Python #13 \u2020, so what's the preferred way to do it? Are there any advantages or disadvantages to one way or the other?\n\u2020 There should be one \u2014 and preferably only one \u2014 obvious way to do it.\n", "output": "I found this to be the only one short + flexible + portable + readable:\nfrom __future__ import print_function\nimport sys\n\ndef eprint(*args, **kwargs):\n print(*args, file=sys.stderr, **kwargs)\n\nThe function eprint can be used in the same way as the standard print function:\n>>> print(\"Test\")\nTest\n>>> eprint(\"Test\")\nTest\n>>> eprint(\"foo\", \"bar\", \"baz\", sep=\"---\")\nfoo---bar---baz\n\n"} {"input": "The meaning of a single- and a double-underscore before an object name in Python I want to clear this up once and for all. Can someone please explain the exact meaning of having leading underscores before an object's name in Python? 
Also explain the difference between a single and a double leading underscore. Also, does that meaning stay the same whether the object in question is a variable, a function, a method, etcetera?\n", "output": "Single Underscore\nNames, in a class, with a leading underscore are simply to indicate to other programmers that the attribute or method is intended to be private. However, nothing special is done with the name itself.\nTo quote PEP-8:\n\n_single_leading_underscore: weak \"internal use\" indicator. E.g. from M import * does not import objects whose name starts with an underscore.\n\nDouble Underscore (Name Mangling)\nFrom the Python docs:\n\nAny identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard to the syntactic position of the identifier, so it can be used to define class-private instance and class variables, methods, variables stored in globals, and even variables stored in instances, private to this class on instances of other classes. \n\nAnd a warning from the same page:\n\nName mangling is intended to give classes an easy way to define \u201cprivate\u201d instance variables and methods, without having to worry about instance variables defined by derived classes, or mucking with instance variables by code outside the class. Note that the mangling rules are designed mostly to avoid accidents; it still is possible for a determined soul to access or modify a variable that is considered private.\n\nExample\n>>> class MyClass():\n... def __init__(self):\n... self.__superprivate = \"Hello\"\n... 
self._semiprivate = \", world!\"\n...\n>>> mc = MyClass()\n>>> print mc.__superprivate\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nAttributeError: MyClass instance has no attribute '__superprivate'\n>>> print mc._semiprivate\n, world!\n>>> print mc.__dict__\n{'_MyClass__superprivate': 'Hello', '_semiprivate': ', world!'}\n\n"} {"input": "How to leave/exit/deactivate a python virtualenv? I'm using virtualenv and the virtualenvwrapper. I can switch between virtualenvs just fine using the workon command. \nme@mymachine:~$ workon env1\n(env1)me@mymachine:~$ workon env2\n(env2)me@mymachine:~$ workon env1\n(env1)me@mymachine:~$\n\nHowever, how do I exit all virtualenvs and work on my real machine again? Right now, the only way I have of getting back to\nme@mymachine:~$\n\nis to exit the shell and start a new one. That's kind of annoying. Is there a command to workon \"nothing\", and if so, what is it? If such a command does not exist, how would I go about creating it?\n", "output": "Usually, activating a virtualenv gives you a shell function named:\n$ deactivate\n\nwhich puts things back to normal.\nEdit: I have just looked specifically again at the code for virtualenvwrapper, and, yes, it too supports \"deactivate\" as the way to escape from all virtualenvs.\nEdit: If you are trying to leave an Anaconda environment, the procedure is a bit different: run the two-word command source deactivate, since they implement deactivation using a stand-alone script.\nbash-4.3$ deactivate\npyenv-virtualenv: deactivate must be sourced. Run 'source deactivate' instead of 'deactivate'\nbash-4.3$ source deactivate\npyenv-virtualenv: no virtualenv has been activated.\n\n"} {"input": "Python progression path - From apprentice to guru I've been learning, working, and playing with Python for a year and a half now. As a biologist slowly making the turn to bio-informatics, this language has been at the very core of all the major contributions I have made in the lab. 
I more or less fell in love with the way Python permits me to express beautiful solutions and also with the semantics of the language that allows such a natural flow from thoughts to workable code.\nWhat I would like to know is your answer to a kind of question I have seldom seen in this or other forums. This question seems central to me for anyone on the path to Python improvement but who wonders what his next steps should be.\nLet me sum up what I do NOT want to ask first ;)\n\nI don't want to know how to QUICKLY learn Python\nNor do I want to find out the best way to get acquainted with the language\nFinally, I don't want to know a 'one trick that does it all' approach.\n\nWhat I do want to know your opinion about, is:\nWhat are the steps YOU would recommend to a Python journeyman, from apprenticeship to guru status (feel free to stop wherever your expertise dictates it), in order that one IMPROVES CONSTANTLY, becoming a better and better Python coder, one step at a time. Some of the people on SO almost seem worthy of worship for their Python prowess, please enlighten us :)\nThe kind of answers I would enjoy (but feel free to surprise the readership :P ), is formatted more or less like this:\n\nRead this (eg: python tutorial), pay attention to that kind of details\nCode for so manytime/problems/lines of code\nThen, read this (eg: this or that book), but this time, pay attention to this\nTackle a few real-life problems\nThen, proceed to reading Y.\nBe sure to grasp these concepts\nCode for X time\nCome back to such and such basics or move further to...\n(you get the point :)\n\nI really care about knowing your opinion on what exactly one should pay attention to, at various stages, in order to progress CONSTANTLY (with due efforts, of course). If you come from a specific field of expertise, discuss the path you see as appropriate in this field.\nEDIT: Thanks to your great input, I'm back on the Python improvement track! 
I really appreciate!\n", "output": "I thought the process of Python mastery went something like:\n\nDiscover list comprehensions\nDiscover generators\nIncorporate map, reduce, filter, iter, range, xrange often into your code\nDiscover Decorators\nWrite recursive functions, a lot\nDiscover itertools and functools\nRead Real World Haskell (read free online)\nRewrite all your old Python code with tons of higher order functions, recursion, and whatnot.\nAnnoy your cubicle mates every time they present you with a Python class. Claim it could be \"better\" implemented as a dictionary plus some functions. Embrace functional programming.\nRediscover the Strategy pattern and then all those things from imperative code you tried so hard to forget after Haskell.\nFind a balance.\n\n"} {"input": "Converting integer to string in Python? I want to convert an integer to a string in Python. I am typecasting it in vain:\nt=raw_input()\nc=[]\nfor j in range(0,int(t)):\n n=raw_input()\n a=[]\n a,b= (int(i) for i in n.split(' '))\n d=pow(a,b)\n d.str()\n c.append(d[0])\nfor j in c:\n print j\n\nWhen I try to convert it to string, it's showing an error like int doesn't have any attribute called str.\n", "output": ">>> str(10)\n'10'\n>>> int('10')\n10\n\n[Edit]\nLinks to the documentation:\nint()\nstr()\n[Edit]\nThe problem seems to come from this line: d.str()\nConversion to string is done with the builtin str() function, which basically calls the __str__() method of its parameter.\nAlso, it shouldn't be necessary to call pow(). 
Try using the ** operator.\n"} {"input": "Nicest way to pad zeroes to string What is the nicest/shortest way to pad a string with zeroes to the left, so the string has a specific length?\n", "output": "Strings:\n>>> n = '4'\n>>> print n.zfill(3)\n004\n\nAnd for numbers:\n>>> n = 4\n>>> print '%03d' % n\n004\n>>> print format(4, '03') # python >= 2.6\n004\n>>> print '{0:03d}'.format(4) # python >= 2.6\n004\n>>> print '{foo:03d}'.format(foo=4) # python >= 2.6\n004\n>>> print('{:03d}'.format(4)) # python >= 2.7 + python3\n004\n>>> print('{0:03d}'.format(4)) # python 3\n004\n\nString formatting documentation.\n"} {"input": "Peak detection in a 2D array I'm helping a veterinary clinic measure pressure under a dog's paw. I use Python for my data analysis and now I'm stuck trying to divide the paws into (anatomical) subregions.\nI made a 2D array of each paw, which consists of the maximal values for each sensor that has been loaded by the paw over time. Here's an example of one paw, where I used Excel to draw the areas I want to 'detect'. These are 2 by 2 boxes around the sensors with local maxima, that together have the largest sum.\n\nSo I tried some experimenting and decided to simply look for the maximums of each column and row (I can't look in one direction due to the shape of the paw). This seems to 'detect' the location of the separate toes fairly well, but it also marks neighboring sensors. \n\nSo what would be the best way to tell Python which of these maximums are the ones I want? \nNote: The 2x2 squares can't overlap, since they have to be separate toes!\nAlso I took 2x2 as a convenience; any more advanced solution is welcome, but I'm simply a human movement scientist, so I'm neither a real programmer nor a mathematician, so please keep it 'simple'. \nHere's a version that can be loaded with np.loadtxt\n\nResults\nSo I tried @jextee's solution (see the results below). As you can see, it works very well on the front paws, but less well for the hind legs. 
\nMore specifically, it can't recognize the small peak that's the fourth toe. This is obviously inherent to the fact that the loop looks top down towards the lowest value, without taking into account where this is. \nWould anyone know how to tweak @jextee's algorithm, so that it might be able to find the 4th toe too?\n\nSince I haven't processed any other trials yet, I can't supply any other samples. But the data I gave before were the averages of each paw. This file is an array with the maximal data of 9 paws in the order they made contact with the plate.\nThis image shows how they were spatially spread out over the plate.\n\nUpdate:\nI have set up a blog for anyone interested and I have set up a SkyDrive with all the raw measurements. So to anyone requesting more data: more power to you! \n\nNew update:\nSo after the help I got with my questions regarding paw detection and paw sorting, I was finally able to check the toe detection for every paw! Turns out, it doesn't work so well on anything but paws sized like the one in my own example. Of course, in hindsight, it's my own fault for choosing the 2x2 so arbitrarily.\nHere's a nice example of where it goes wrong: a nail is being recognized as a toe and the 'heel' is so wide, it gets recognized twice!\n\nThe paw is too large, so taking a 2x2 size with no overlap causes some toes to be detected twice. The other way around, in small dogs it often fails to find a 5th toe, which I suspect is caused by the 2x2 area being too large.\nAfter trying the current solution on all my measurements I came to the staggering conclusion that for nearly all my small dogs it didn't find a 5th toe, and that in over 50% of the impacts for the large dogs it would find more!\nSo clearly I need to change it. My own guess was changing the size of the neighborhood to something smaller for small dogs and larger for large dogs. But generate_binary_structure wouldn't let me change the size of the array. 
\nTherefore, I'm hoping that anyone else has a better suggestion for locating the toes, perhaps having the toe area scale with the paw size?\n", "output": "I detected the peaks using a local maximum filter. Here is the result on your first dataset of 4 paws:\n\nI also ran it on the second dataset of 9 paws and it worked as well.\nHere is how you do it:\nimport numpy as np\nfrom scipy.ndimage.filters import maximum_filter\nfrom scipy.ndimage.morphology import generate_binary_structure, binary_erosion\nimport matplotlib.pyplot as pp\n\n#for some reason I had to reshape. Numpy ignored the shape header.\npaws_data = np.loadtxt(\"paws.txt\").reshape(4,11,14)\n\n#getting a list of images\npaws = [p.squeeze() for p in np.vsplit(paws_data,4)]\n\n\ndef detect_peaks(image):\n \"\"\"\n Takes an image and detects the peaks using the local maximum filter.\n Returns a boolean mask of the peaks (i.e. 1 when\n the pixel's value is the neighborhood maximum, 0 otherwise)\n \"\"\"\n\n # define an 8-connected neighborhood\n neighborhood = generate_binary_structure(2,2)\n\n #apply the local maximum filter; all pixels of maximal value \n #in their neighborhood are set to 1\n local_max = maximum_filter(image, footprint=neighborhood)==image\n #local_max is a mask that contains the peaks we are \n #looking for, but also the background.\n #In order to isolate the peaks we must remove the background from the mask.\n\n #we create the mask of the background\n background = (image==0)\n\n #a little technicality: we must erode the background in order to \n #successfully subtract it from local_max, otherwise a line will \n #appear along the background border (artifact of the local maximum filter)\n eroded_background = binary_erosion(background, structure=neighborhood, border_value=1)\n\n #we obtain the final mask, containing only peaks, \n #by removing the background from the local_max mask (xor, since\n #subtracting boolean arrays is not allowed in recent NumPy versions)\n detected_peaks = local_max ^ eroded_background\n\n return detected_peaks\n\n\n#applying the detection and 
plotting results\nfor i, paw in enumerate(paws):\n detected_peaks = detect_peaks(paw)\n pp.subplot(4,2,(2*i+1))\n pp.imshow(paw)\n pp.subplot(4,2,(2*i+2))\n pp.imshow(detected_peaks)\n\npp.show()\n\nAll you need to do afterwards is use scipy.ndimage.measurements.label on the mask to label all distinct objects. Then you'll be able to play with them individually.\nNote that the method works well because the background is not noisy. If it were, you would detect a bunch of other unwanted peaks in the background. Another important factor is the size of the neighborhood. You will need to adjust it if the peak size changes (they should remain roughly proportional).\n"} {"input": "Access environment variables from Python I set an environment variable that I want to access in my Python application. How do I get this value?\n", "output": "Environment variables are accessed through os.environ:\nimport os\nprint os.environ['HOME']\n\nOr you can see a list of all the environment variables using:\nos.environ\n\nAs sometimes you might need to see a complete list!\n# using get will return `None` if a key is not present rather than raise a `KeyError`\nprint os.environ.get('KEY_THAT_MIGHT_EXIST')\n\n# os.getenv is equivalent, and can also give a default value instead of `None`\nprint os.getenv('KEY_THAT_MIGHT_EXIST', default_value)\n\nThe default Python installation on Windows is C:\\Python. If you want to find out while running Python, you can do:\nimport sys\nprint sys.prefix\n\n"} {"input": "How to trim whitespace (including tabs)? 
Is there a function that will trim not only spaces, but also tabs and other whitespace?\n", "output": "Whitespace on both sides:\ns = \" \\t a string example\\t \"\ns = s.strip()\n\nWhitespace on the right side:\ns = s.rstrip()\n\nWhitespace on the left side:\ns = s.lstrip()\n\nAs thedz points out, you can provide an argument to strip arbitrary characters to any of these functions like this:\ns = s.strip(' \\t\\n\\r')\n\nThis will strip any space, \\t, \\n, or \\r characters from the left-hand side, right-hand side, or both sides of the string. \nThe examples above only remove characters from the left-hand and right-hand sides of strings. If you want to also remove characters from the middle of a string, try re.sub:\nimport re\nprint re.sub(r'\\s+', '', s)\n\nThat should print out:\nastringexample\n\n"} {"input": "How to remove an element from a list by index in Python? I found the list.remove method, but say I want to remove the last element, how do I do this? It seems like the default remove searches the list, but I don't want any search to be performed.\n", "output": "Use del and specify the element you want to delete with the index:\nIn [9]: a = range(10)\nIn [10]: a\nOut[10]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\nIn [11]: del a[-1]\nIn [12]: a\nOut[12]: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nHere is the section from the tutorial.\n"} {"input": "Why does Python code run faster in a function? 
def main():\n for i in xrange(10**8):\n pass\nmain()\n\nThis piece of code in Python runs in \nreal 0m1.841s\nuser 0m1.828s\nsys 0m0.012s\n\nHowever, if the for loop isn't placed within a function, \nfor i in xrange(10**8):\n pass\n\nthen it runs for a much longer time:\nreal 0m4.543s\nuser 0m4.524s\nsys 0m0.012s\n\nWhy is this?\nNote: The timing is done with the time function in BASH in Linux.\n", "output": "Inside a function, the bytecode is\n 2 0 SETUP_LOOP 20 (to 23)\n 3 LOAD_GLOBAL 0 (xrange)\n 6 LOAD_CONST 3 (100000000)\n 9 CALL_FUNCTION 1\n 12 GET_ITER \n >> 13 FOR_ITER 6 (to 22)\n 16 STORE_FAST 0 (i)\n\n 3 19 JUMP_ABSOLUTE 13\n >> 22 POP_BLOCK \n >> 23 LOAD_CONST 0 (None)\n 26 RETURN_VALUE \n\nAt top level, the bytecode is\n 1 0 SETUP_LOOP 20 (to 23)\n 3 LOAD_NAME 0 (xrange)\n 6 LOAD_CONST 3 (100000000)\n 9 CALL_FUNCTION 1\n 12 GET_ITER \n >> 13 FOR_ITER 6 (to 22)\n 16 STORE_NAME 1 (i)\n\n 2 19 JUMP_ABSOLUTE 13\n >> 22 POP_BLOCK \n >> 23 LOAD_CONST 2 (None)\n 26 RETURN_VALUE \n\nThe difference is that STORE_FAST is faster (!) than STORE_NAME. This is because in a function, i is a local but at toplevel it is a global.\nTo examine bytecode, use the dis module. I was able to disassemble the function directly, but to disassemble the toplevel code I had to use the compile builtin.\n"} {"input": "Trimming a string in Python How do I remove leading and trailing whitespace from a string in Python?\nFor example:\n\" Hello \" --> \"Hello\"\n\" Hello\" --> \"Hello\"\n\"Hello \" --> \"Hello\"\n\"Bob has a cat\" --> \"Bob has a cat\"\n\n", "output": "Just one space, or all such spaces? 
If the second, then strings already have a .strip() method:\n>>> ' Hello '.strip()\n'Hello'\n>>> ' Hello'.strip()\n'Hello'\n>>> 'Bob has a cat'.strip()\n'Bob has a cat'\n>>> ' Hello '.strip() # ALL spaces at ends removed\n'Hello'\n\nIf you need only to remove one space however, you could do it with:\ndef strip_one_space(s):\n if s.endswith(\" \"): s = s[:-1]\n if s.startswith(\" \"): s = s[1:]\n return s\n\n>>> strip_one_space(\" Hello \")\n' Hello'\n\nAlso, note that str.strip() removes other whitespace characters as well (e.g. tabs and newlines). To remove only spaces, you can specify the character to remove as an argument to strip, i.e.:\n>>> \" Hello\\n\".strip(\" \")\n'Hello\\n'\n\n"} {"input": "Limiting floats to two decimal points I want a to be rounded to 13.95.\n>>> a\n13.949999999999999\n>>> round(a, 2)\n13.949999999999999\n\nThe round function does not work the way I expected.\n", "output": "You are running into the old problem with floating point numbers that all numbers cannot be represented. The command line is just showing you the full floating point form from memory. In floating point your rounded version is the same number. Since computers are binary they store floating point numbers as an integer and then divide it by a power of two so 13.95 will be represented in a similar fashion to 125650429603636838/(2**53). Double precision numbers have 53 bits (16 digits) of precision and regular floats have 24 bits (8 digits) of precision. 
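To see the exact value a float actually stores, you can hand it to decimal.Decimal, which exposes the full binary expansion (a quick sketch; this works on Python 2.7+ and 3, where Decimal accepts a float directly):

```python
from decimal import Decimal

# Decimal(float) shows the exact stored binary value, with no rounding
print(Decimal(13.95))                      # a long tail of digits, not exactly 13.95
print(Decimal(13.95) == Decimal("13.95"))  # False: the stored value differs slightly
```
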
Python's floating point uses double precision to store the values.\nFor example:\n >>> 125650429603636838/(2**53)\n 13.949999999999999\n\n >>> 234042163/(2**24)\n 13.949999988079071\n\n >>> a=13.946\n >>> print(a)\n 13.946\n >>> print(\"%.2f\" % a)\n 13.95\n >>> round(a,2)\n 13.949999999999999\n >>> print(\"%.2f\" % round(a,2))\n 13.95\n >>> print(\"{0:.2f}\".format(a))\n 13.95\n >>> print(\"{0:.2f}\".format(round(a,2)))\n 13.95\n >>> print(\"{0:.15f}\".format(round(a,2)))\n 13.949999999999999\n\nIf you are after only two decimal places, as in currency, then you have a couple of better choices: use integers and store values in cents, not dollars, and then divide by 100 to convert to dollars; or use a fixed-point number like decimal.\n"} {"input": "How to determine the variable type in Python? How do I see the type of a variable, whether it is unsigned 32 bit, signed 16 bit, etc.?\nHow do I view it?\n", "output": "Python doesn't have the same types as C/C++, which appears to be your question.\nTry this:\n>>> i = 123\n>>> type(i)\n<type 'int'>\n>>> type(i) is int\nTrue\n>>> i = 123456789L\n>>> type(i)\n<type 'long'>\n>>> type(i) is long\nTrue\n>>> i = 123.456\n>>> type(i)\n<type 'float'>\n>>> type(i) is float\nTrue\n\nThe distinction between int and long goes away in Python 3.0, though.\n"} {"input": "How to flush output of Python print? How do I force Python's print function to output to the screen?\n", "output": "import sys\nsys.stdout.flush()\n\nPrint by default prints to sys.stdout.\nReferences:\n\nhttp://docs.python.org/reference/simple_stmts.html#the-print-statement\nhttp://docs.python.org/library/sys.html\nhttp://docs.python.org/library/stdtypes.html#file-objects\n\n"} {"input": "What is the difference between old style and new style classes in Python? 
Is there ever a reason to use old-style classes these days?\n", "output": "From http://docs.python.org/2/reference/datamodel.html#new-style-and-classic-classes :\n\nUp to Python 2.1, old-style classes were the only flavour available to the user. The concept of (old-style) class is unrelated to the concept of type: if x is an instance of an old-style class, then x.__class__ designates the class of x, but type(x) is always <type 'instance'>. This reflects the fact that all old-style instances, independently of their class, are implemented with a single built-in type, called instance. \nNew-style classes were introduced in Python 2.2 to unify the concepts of class and type. A new-style class is simply a user-defined type, no more, no less. If x is an instance of a new-style class, then type(x) is typically the same as x.__class__ (although this is not guaranteed -- a new-style class instance is permitted to override the value returned for x.__class__).\nThe major motivation for introducing new-style classes is to provide a unified object model with a full meta-model. It also has a number of immediate benefits, like the ability to subclass most built-in types, or the introduction of \"descriptors\", which enable computed properties. \nFor compatibility reasons, classes are still old-style by default. New-style classes are created by specifying another new-style class (i.e. a type) as a parent class, or the \"top-level type\" object if no other parent is needed. The behaviour of new-style classes differs from that of old-style classes in a number of important details in addition to what type returns. Some of these changes are fundamental to the new object model, like the way special methods are invoked. Others are \"fixes\" that could not be implemented before for compatibility concerns, like the method resolution order in case of multiple inheritance. \nPython 3 only has new-style classes. No matter if you subclass from object or not, classes are new-style in Python 3. 
It is however recommended that you still subclass from object.\n\n"} {"input": "Extracting extension from filename in Python Is there a function to extract the extension from a filename?\n", "output": "Yes. Use os.path.splitext:\n>>> import os\n>>> filename, file_extension = os.path.splitext('/path/to/somefile.ext')\n>>> filename\n'/path/to/somefile'\n>>> file_extension\n'.ext'\n\n"} {"input": "Proper way to declare custom exceptions in modern Python? What's the proper way to declare custom exception classes in modern Python? My primary goal is to follow whatever standard other exception classes have, so that (for instance) any extra string I include in the exception is printed out by whatever tool caught the exception.\nBy \"modern Python\" I mean something that will run in Python 2.5 but be 'correct' for the Python 2.6 and Python 3.* way of doing things. And by \"custom\" I mean an Exception object that can include extra data about the cause of the error: a string, maybe also some other arbitrary object relevant to the exception.\nI was tripped up by the following deprecation warning in Python 2.6.2:\n>>> class MyError(Exception):\n... def __init__(self, message):\n... self.message = message\n... \n>>> MyError(\"foo\")\n_sandbox.py:3: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6\n\nIt seems crazy that BaseException has a special meaning for attributes named message. I gather from PEP-352 that attribute did have a special meaning in 2.5 they're trying to deprecate away, so I guess that name (and that one alone) is now forbidden? Ugh.\nI'm also fuzzily aware that Exception has some magic parameter args, but I've never known how to use it. Nor am I sure it's the right way to do things going forward; a lot of the discussion I found online suggested they were trying to do away with args in Python 3.\nUpdate: two answers have suggested overriding __init__, and __str__/__unicode__/__repr__. 
That seems like a lot of typing, is it necessary?\n", "output": "Maybe I missed the question, but why not:\nclass MyException(Exception):\n pass\n\nEdit: to override something (or pass extra args), do this:\nclass ValidationError(Exception):\n def __init__(self, message, errors):\n\n # Call the base class constructor with the parameters it needs\n super(ValidationError, self).__init__(message)\n\n # Now for your custom code...\n self.errors = errors\n\nThat way you could pass dict of error messages to the second param, and get to it later with e.errors\n"} {"input": "Why does comparing strings in Python using either '==' or 'is' sometimes produce a different result? I've got a Python program where two variables are set to the value 'public'. In a conditional expression I have the comparison var1 is var2 which fails, but if I change it to var1 == var2 it returns True.\nNow if I open my Python interpreter and do the same \"is\" comparison, it succeeds.\n>>> s1 = 'public'\n>>> s2 = 'public'\n>>> s2 is s1\nTrue\n\nWhat am I missing here?\n", "output": "is is identity testing, == is equality testing. what happens in your code would be emulated in the interpreter like this:\n>>> a = 'pub'\n>>> b = ''.join(['p', 'u', 'b'])\n>>> a == b\nTrue\n>>> a is b\nFalse\n\nso, no wonder they're not the same, right?\nIn other words: is is the id(a) == id(b)\n"} {"input": "How can I count the occurrences of a list item in Python? Given an item, how can I count its occurrences in a list in Python?\n", "output": "If you are using Python 2.7 or 3 and you want number of occurrences for each element:\n>>> from collections import Counter\n>>> z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']\n>>> Counter(z)\nCounter({'blue': 3, 'red': 2, 'yellow': 1})\n\n"} {"input": "What's the canonical way to check for type in python? What is the best way to check whether a given object is of a given type? How about checking whether the object inherits from a given type?\nLet's say I have an object o. 
How do I check whether it's a str?\n", "output": "To check if the type of o is exactly str:\ntype(o) is str\n\nTo check if o is an instance of str or any subclass of str (this would be the \"canonical\" way):\nisinstance(o, str)\n\nThe following also works, and can be useful in some cases:\nissubclass(type(o), str)\ntype(o) in ([str] + str.__subclasses__())\n\nSee Built-in Functions in the Python Library Reference for relevant information.\nOne more note: in this case, you may actually want to use:\nisinstance(o, basestring)\n\nbecause this will also catch Unicode strings (unicode is not a subclass of str; both str and unicode are subclasses of basestring).\nAlternatively, isinstance accepts a tuple of classes. This will return True if x is an instance of any subclass of any of (str, unicode):\nisinstance(o, (str, unicode))\n\n"} {"input": "Differences between isinstance() and type() in python What are the differences between these two code fragments? Which way is considered to be more pythonic?\nUsing type():\nimport types\n\nif type(a) is types.DictType:\n do_something()\nif type(b) in types.StringTypes:\n do_something_else()\n\nUsing isinstance():\nif isinstance(a, dict):\n do_something()\nif isinstance(b, str) or isinstance(b, unicode):\n do_something_else()\n\nEdit: This seems to be discussed already: link.\n", "output": "To summarize the contents of other (already good!) 
answers, isinstance caters for inheritance (an instance of a derived class is an instance of a base class, too), while checking for equality of type does not (it demands identity of types and rejects instances of subtypes, AKA subclasses).\nNormally, in Python, you want your code to support inheritance, of course (since inheritance is so handy, it would be bad to stop code using yours from using it!), so isinstance is less bad than checking identity of types because it seamlessly supports inheritance.\nIt's not that isinstance is good, mind you -- it's just less bad than checking equality of types. The normal, Pythonic, preferred solution is almost invariably \"duck typing\": try using the argument as if it was of a certain desired type, do it in a try/except statement catching all exceptions that could arise if the argument was not in fact of that type (or any other type nicely duck-mimicking it ;-), and in the except clause, try something else (using the argument \"as if\" it was of some other type).\nbasestring is, however, quite a special case -- a builtin type that exists only to let you use isinstance (both str and unicode subclass basestring). Strings are sequences (you could loop over them, index them, slice them, ...), but you generally want to treat them as \"scalar\" types -- it's somewhat inconvenient (but a reasonably frequent use case) to treat all kinds of strings (and maybe other scalar types, i.e., ones you can't loop on) one way, all containers (lists, sets, dicts, ...) 
in another way, and basestring plus isinstance helps you do that -- the overall structure of this idiom is something like:\nif isinstance(x, basestring):\n return treatasscalar(x)\ntry:\n return treatasiter(iter(x))\nexcept TypeError:\n return treatasscalar(x)\n\nYou could say that basestring is an Abstract Base Class (\"ABC\") -- it offers no concrete functionality to subclasses, but rather exists as a \"marker\", mainly for use with isinstance. The concept is obviously a growing one in Python, since PEP 3119, which introduces a generalization of it, was accepted and has been implemented starting with Python 2.6 and 3.0.\nThe PEP makes it clear that, while ABCs can often substitute for duck typing, there is generally no big pressure to do that (see here). ABCs as implemented in recent Python versions do however offer extra goodies: isinstance (and issubclass) can now mean more than just \"[an instance of] a derived class\" (in particular, any class can be \"registered\" with an ABC so that it will show as a subclass, and its instances as instances of the ABC); and ABCs can also offer extra convenience to actual subclasses in a very natural way via Template Method design pattern applications (see here and here [[part II]] for more on the TM DP, in general and specifically in Python, independent of ABCs).\nFor the underlying mechanics of ABC support as offered in Python 2.6, see here; for their 3.1 version, very similar, see here. 
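As a tiny illustration of that registration mechanism (sketched here with Python 3's abc.ABC; on 2.6 you would use the ABCMeta metaclass instead):

```python
from abc import ABC

class Stringish(ABC):
    """A marker ABC, similar in spirit to basestring."""

class MyText:
    """Does not inherit from Stringish at all."""
    def __init__(self, s):
        self.s = s

# register() makes isinstance/issubclass treat MyText as a Stringish
Stringish.register(MyText)

print(issubclass(MyText, Stringish))        # True
print(isinstance(MyText("hi"), Stringish))  # True
```
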
In both versions, standard library module collections (that's the 3.1 version -- for the very similar 2.6 version, see here) offers several useful ABCs.\nFor the purpose of this answer, the key thing to retain about ABCs (beyond an arguably more natural placement for TM DP functionality, compared to the classic Python alternative of mixin classes such as UserDict.DictMixin) is that they make isinstance (and issubclass) much more attractive and pervasive (in Python 2.6 and going forward) than they used to be (in 2.5 and before), and therefore, by contrast, make checking type equality an even worse practice in recent Python versions than it already used to be.\n"} {"input": "What is the purpose of self in Python? I understand it refers to the specific object created from that class, but I can't see why it explicitly needs to be added to every function as a parameter. To illustrate, in Ruby I can do this:\nclass myClass\n def myFunc(name)\n @name = name\n end\nend\n\nWhich I understand, quite easily. However in Python I need to include self:\nclass myClass:\n def myFunc(self, name):\n self.name = name\n\nCan anyone talk me through this? It is not something I've come across in my (admittedly limited) experience.\n", "output": "The reason you need to use self. is because Python does not use the @ syntax to refer to instance attributes. Python decided to do methods in a way that makes the instance to which the method belongs be passed automatically, but not received automatically: the first parameter of methods is the instance the method is called on. That makes methods entirely the same as functions, and leaves the actual name to use up to you (although self is the convention, and people will generally frown at you when you use something else.) 
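To see that a method really is just a function taking the instance first, you can call it through the class and pass the instance yourself (a quick sketch, reusing the question's own example):

```python
class MyClass:
    def myFunc(self, name):
        self.name = name

obj = MyClass()
obj.myFunc("Alice")            # the usual call: obj is passed as self automatically
MyClass.myFunc(obj, "Bob")     # exactly equivalent: self passed by hand
print(obj.name)                # Bob
```
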
self is not special to the code, it's just another object.\nPython could have done something else to distinguish normal names from attributes -- special syntax like Ruby has, or requiring declarations like C++ and Java do, or perhaps something yet more different -- but it didn't. Python's all for making things explicit, making it obvious what's what, and although it doesn't do it entirely everywhere, it does do it for instance attributes. That's why assigning to an instance attribute needs to know what instance to assign to, and that's why it needs self..\n"} {"input": "How do I remove packages installed with Python's easy_install? Python's easy_install makes installing new packages extremely convenient. However, as far as I can tell, it doesn't implement the other common features of a dependency manager - listing and removing installed packages.\nWhat is the best way of finding out what's installed, and what is the preferred way of removing installed packages? Are there any files that need to be updated if I remove packages manually (e.g. by rm /usr/local/lib/python2.6/dist-packages/my_installed_pkg.egg or similar)?\n", "output": "pip, an alternative to setuptools/easy_install, provides an \"uninstall\" command. \nInstall pip according to the installation instructions:\n$ wget https://bootstrap.pypa.io/get-pip.py\n$ python get-pip.py\n\nThen you can use pip uninstall to remove packages installed with easy_install\n"} {"input": "In Python, how do I determine if an object is iterable? Is there a method like isiterable? The only solution I have found so far is to call\nhasattr(myObj, '__iter__')\n\nBut I am not sure how fool-proof this is.\n", "output": "\nChecking for __iter__ works on sequence types, but it would fail on e.g. strings. 
I would like to know the right answer too, until then, here is one possibility (which would work on strings, too):\ntry:\n some_object_iterator = iter(some_object)\nexcept TypeError, te:\n print some_object, 'is not iterable'\n\nThe iter built-in checks for the __iter__ method or in the case of strings the __getitem__ method. To check if an object is \"list like\" and not \"string like\" then the key is the attributes __getitem__ and __iter__:\n\n In [9]: hasattr([1,2,3,4], '__iter__')\n Out[9]: True\n In [11]: hasattr((1,2,3,4), '__iter__')\n Out[11]: True\n In [12]: hasattr(u\"hello\", '__iter__')\n Out[12]: False\n In [14]: hasattr(u\"hello\", '__getitem__')\n Out[14]: True\n\n\nAnother general pythonic approach is to assume an iterable, then fail gracefully if it does not work on the given object. The Python glossary:\n\nPythonic programming style that determines an object's type by inspection of its method or attribute signature rather than by explicit relationship to some type object (\"If it looks like a duck and quacks like a duck, it must be a duck.\") By emphasizing interfaces rather than specific types, well-designed code improves its flexibility by allowing polymorphic substitution. Duck-typing avoids tests using type() or isinstance(). Instead, it typically employs the EAFP (Easier to Ask Forgiveness than Permission) style of programming.\n...\ntry:\n _ = (e for e in my_object)\nexcept TypeError:\n print my_object, 'is not iterable'\n\n\nThe collections module provides some abstract base classes, which allow to ask classes or instances if they provide particular functionality, for example:\nimport collections\n\nif isinstance(e, collections.Iterable):\n # e is iterable\n\n\n"} {"input": "Why do people write #!/usr/bin/env python on the first line of a Python script? 
It seems to me like the files run the same without that line.\n", "output": "If you have several versions of Python installed, /usr/bin/env will ensure the interpreter used is the first one on your environment's $PATH. The alternative would be to hardcode something like #!/usr/bin/python; that's ok, but less flexible.\nIn Unix, an executable file that's meant to be interpreted can indicate what interpreter to use by having a #! at the start of the first line, followed by the interpreter (and any flags it may need).\nIf you're talking about other platforms, of course, this rule does not apply (but that \"shebang line\" does no harm, and will help if you ever copy that script to a platform with a Unix base, such as Linux, Mac, etc).\n"} {"input": "What's the difference between list and tuples? What's the difference?\nWhat are the advantages / disadvantages of tuples / lists?\n", "output": "Apart from tuples being immutable there is also a semantic distinction that should guide their usage. Tuples are heterogeneous data structures (i.e., their entries have different meanings), while lists are homogeneous sequences. Tuples have structure, lists have order. \nUsing this distinction makes code more explicit and understandable.\nOne example would be pairs of page and line number to reference locations in a book, e.g.:\nmy_location = (42, 11) # page number, line number\n\nYou can then use this as a key in a dictionary to store notes on locations. A list on the other hand could be used to store multiple locations. Naturally one might want to add or remove locations from the list, so it makes sense that lists are mutable. On the other hand it doesn't make sense to add or remove items from an existing location - hence tuples are immutable.\nThere might be situations where you want to change items within an existing location tuple, for example when iterating through the lines of a page. But tuple immutability forces you to create a new location tuple for each new value. 
This seems inconvenient on the face of it, but using immutable data like this is a cornerstone of value types and functional programming techniques, which can have substantial advantages.\nThere are some interesting articles on this issue, e.g. \"Python Tuples are Not Just Constant Lists\" or \"Understanding tuples vs. lists in Python\". The official Python documentation also mentions this (\"Tuples are immutable, and usually contain an heterogeneous sequence ...\").\nIn a statically typed language like Haskell the values in a tuple generally have different types and the length of the tuple must be fixed. In a list the values all have the same type and the length is not fixed. So the difference is very obvious.\nFinally there is the namedtuple in Python, which makes sense because a tuple is already supposed to have structure. This underlines the idea that tuples are a light-weight alternative to classes and instances.\n"} {"input": "Import a module from a relative path How do I import a python module given its relative path?\nFor example, if dirFoo contains Foo.py and dirBar, and dirBar contains Bar.py, how do I import Bar.py into Foo.py?\nHere's a visual representation:\ndirFoo\\\n Foo.py\n dirBar\\\n Bar.py\n\nFoo wishes to include Bar, but restructuring the folder hierarchy is not an option.\n", "output": "Be sure that dirBar has the __init__.py file -- this makes a directory into a Python package.\n"} {"input": "ASCII value of a character in Python How do I get the ASCII value of a character as an int in Python?\n", "output": "From here:\n\nfunction ord() would get the int value\n of the char. 
And in case you want to\n convert back after playing with the\n number, function chr() does the trick.\n\n>>> ord('a')\n97\n>>> chr(97)\n'a'\n>>> chr(ord('a') + 3)\n'd'\n>>>\n\nThere is also the unichr function, returning the Unicode character whose ordinal is the unichr argument:\n>>> unichr(97)\nu'a'\n>>> unichr(1234)\nu'\\u04d2'\n\n"} {"input": "What is a mixin, and why are they useful? In \"Programming Python\", Mark Lutz mentions \"mixins\". I'm from a C/C++/C# background and I have not heard the term before. What is a mixin? \nReading between the lines of this example (which I've linked to because it's quite long), I'm presuming it's a case of using multiple inheritance to extend a class as opposed to 'proper' subclassing. Is this right? \nWhy would I want to do that rather than put the new functionality into a subclass? For that matter, why would a mixin/multiple inheritance approach be better than using composition?\nWhat separates a mixin from multiple inheritance? Is it just a matter of semantics?\n", "output": "A mixin is a special kind of multiple inheritance. There are two main situations where mixins are used:\n\nYou want to provide a lot of optional features for a class.\nYou want to use one particular feature in a lot of different classes.\n\nFor an example of number one, consider werkzeug's request and response system. 
I can make a plain old request object by saying:\nfrom werkzeug import BaseRequest\n\nclass Request(BaseRequest):\n pass\n\nIf I want to add accept header support, I would make that\nfrom werkzeug import BaseRequest, AcceptMixin\n\nclass Request(BaseRequest, AcceptMixin):\n pass\n\nIf I wanted to make a request object that supports accept headers, etags, authentication, and user agent support, I could do this:\nfrom werkzeug import BaseRequest, AcceptMixin, ETagRequestMixin, UserAgentMixin, AuthorizationMixin\n\nclass Request(BaseRequest, AcceptMixin, ETagRequestMixin, UserAgentMixin, AuthorizationMixin):\n pass\n\nThe difference is subtle, but in the above examples, the mixin classes weren't made to stand on their own. In more traditional multiple inheritance, the AuthenticationMixin (for example) would probably be something more like Authenticator. That is, the class would probably be designed to stand on its own.\n"} {"input": "Convert bytes to a Python string I'm using this code to get standard output from an external program:\n>>> from subprocess import *\n>>> command_stdout = Popen(['ls', '-l'], stdout=PIPE).communicate()[0]\n\nThe communicate() method returns an array of bytes:\n>>> command_stdout\nb'total 0\\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file1\\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file2\\n'\n\nHowever, I'd like to work with the output as a normal Python string. So that I could print it like this:\n>>> print(command_stdout)\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file2\n\nI thought that's what the binascii.b2a_qp() method is for, but when I tried it, I got the same byte array again:\n>>> binascii.b2a_qp(command_stdout)\nb'total 0\\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file1\\n-rw-rw-r-- 1 thomas thomas 0 Mar 3 07:03 file2\\n'\n\nDoes anybody know how to convert the bytes value back to string? I mean, using the \"batteries\" instead of doing it manually. 
And I'd like it to be ok with Python 3.\n", "output": "You need to decode the bytes object to produce a string:\n>>> b\"abcde\"\nb'abcde'\n\n# utf-8 is used here because it is a very common encoding, but you\n# need to use the encoding your data is actually in.\n>>> b\"abcde\".decode(\"utf-8\") \n'abcde'\n\n"} {"input": "Most elegant way to check if the string is empty in Python? Does Python have something like an empty string variable where you can do?:\nif myString == string.empty:\n\nRegardless what's the most elegant way to check for empty string values? I find hard coding \"\" every time for checking an empty string not as good.\n", "output": "Empty strings are \"falsy\" which means they are considered false in a Boolean context, so you can just do this:\nif not myString:\n\nThis is the preferred way if you know that your variable is a string. If your variable could also be some other type then you should use myString == \"\". See the documentation on Truth Value Testing for other values that are false in Boolean contexts.\n"} {"input": "How can I do a line break (line continuation) in Python? I have a long line of code that I want to break up among multiple lines. What do I use and what is the syntax?\nFor example, adding a bunch of strings,\ne = 'a' + 'b' + 'c' + 'd'\n\nand have it like this:\ne = 'a' + 'b' +\n 'c' + 'd'\n\n", "output": "What is the line? 
You can just have arguments on the next line without any problems:\na = dostuff(blahblah1, blahblah2, blahblah3, blahblah4, blahblah5, \n blahblah6, blahblah7)\n\nOtherwise you can do something like this:\nif a == True and \\\n b == False:\n\nCheck the style guide for more information.\nFrom your example line:\na = '1' + '2' + '3' + \\\n '4' + '5'\n\nOr:\na = ('1' + '2' + '3' +\n '4' + '5')\n\nNote that the style guide says that using the implicit continuation with parentheses is preferred, but in this particular case just adding parentheses around your expression is probably the wrong way to go.\n"} {"input": "How can I get a list of locally installed Python modules? I would like to get a list of Python modules, which are in my Python installation (UNIX server).\nHow can you get a list of Python modules installed in your computer?\n", "output": "help('modules')\n\nin a Python shell/prompt.\n"} {"input": "How to use threading in Python? I am trying to understand threading in Python. I've looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I'm having trouble understanding them.\nHow do you clearly show tasks being divided for multi-threading?\n", "output": "Here's a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.\nimport Queue\nimport threading\nimport urllib2\n\n# called by each thread\ndef get_url(q, url):\n q.put(urllib2.urlopen(url).read())\n\ntheurls = [\"http://google.com\", \"http://yahoo.com\"]\n\nq = Queue.Queue()\n\nfor u in theurls:\n t = threading.Thread(target=get_url, args = (q,u))\n t.daemon = True\n t.start()\n\ns = q.get()\nprint s\n\nThis is a case where threading is used as a simple optimization: each subthread is waiting for a URL to resolve and respond, in order to put its contents on the queue; each thread is a daemon (won't keep the process up if main thread ends -- that's more common than not); the main thread starts all subthreads, 
does a get on the queue to wait until one of them has done a put, then emits the results and terminates (which takes down any subthreads that might still be running, since they're daemon threads).\nProper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there's a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, by the way, and they're intrinsically threadsafe so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.\n"} {"input": "How to remove a key from a python dictionary? When trying to delete a key from a dictionary, I write:\nif 'key' in myDict:\n del myDict['key']\n\nIs there a one line way of doing this?\n", "output": "Use dict.pop():\nmy_dict.pop('key', None)\n\nThis will return my_dict[key] if key exists in the dictionary, and None otherwise. If the second parameter is not specified (ie. my_dict.pop('key')) and key does not exist, a KeyError is raised.\n"} {"input": "How do I check what version of Python is running my script? How can I check what version of the Python Interpreter is interpreting my script?\n", "output": "This information is available in the sys.version string in the sys module:\n>>> import sys\n\nHuman readable:\n>>> print (sys.version) #parentheses necessary in python 3. \n2.5.2 (r252:60911, Jul 31 2008, 17:28:52) \n[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)]\n\nFor further processing:\n>>> sys.version_info\n(2, 5, 2, 'final', 0)\n# or\n>>> sys.hexversion\n34014192\n\nTo ensure a script runs with a minimal version requirement of the Python interpreter add this to your code:\nassert sys.version_info >= (2,5)\n\nThis compares major and minor version information. 
Add micro (=0, 1, etc) and even releaselevel (='alpha','final', etc) to the tuple as you like. Note however, that it is almost always better to \"duck\" check if a certain feature is there, and if not, workaround (or bail out). Sometimes features go away in newer releases, being replaced by others.\n"} {"input": "How do I parse XML in Python? I have many rows in a database that contains xml and I'm trying to write a Python script that will go through those rows and count how many instances of a particular node attribute show up. For instance, my tree looks like:\n<foo>\n   <bar>\n      <type foobar=\"1\"/>\n      <type foobar=\"2\"/>\n   </bar>\n</foo>\n\nHow can I access the attributes 1 and 2 in the XML using Python?\n", "output": "I suggest ElementTree. There are other compatible implementations of the same API, such as lxml, and cElementTree in the Python standard library itself; but, in this context, what they chiefly add is even more speed -- the ease of programming part depends on the API, which ElementTree defines.\nAfter building an Element instance e from the XML, e.g. with the XML function, or by parsing a file with something like\nimport xml.etree.ElementTree\ne = xml.etree.ElementTree.parse('thefile.xml').getroot()\n\nor any of the many other ways shown at ElementTree, you just do something like:\nfor atype in e.findall('type'):\n print(atype.get('foobar'))\n\nand similar, usually pretty simple, code patterns.\n"} {"input": "How do you change the size of figures drawn with matplotlib? 
How do you change the size of figure drawn with matplotlib?\n", "output": "figure tells you the call signature:\nfigure(num=None, figsize=(8, 6), dpi=80, facecolor='w', edgecolor='k')\n\nSo figure(figsize=(1,1)) creates an inch-by-inch image, which will be 80-by-80 pixels unless you also give a different dpi argument.\n"} {"input": "Parsing values from a JSON file in Python I have this JSON in a file:\n{\n \"maps\": [\n {\n \"id\": \"blabla\",\n \"iscategorical\": \"0\"\n },\n {\n \"id\": \"blabla\",\n \"iscategorical\": \"0\"\n }\n ],\n \"masks\": [\n \"id\": \"valore\"\n ],\n \"om_points\": \"value\",\n \"parameters\": [\n \"id\": \"valore\"\n ]\n}\n\nI wrote this script which prints all of the json text:\njson_data=open(file_directory).read()\n\ndata = json.loads(json_data)\npprint(data)\n\nHow can I parse the file and extract single values?\n", "output": "I think what Ignacio is saying is that your JSON file is incorrect. You have [] when you should have {}. [] are for lists, {} are for dictionaries.\nHere's how your JSON file should look, your JSON file wouldn't even load for me:\n{\n \"maps\": [\n {\n \"id\": \"blabla\",\n \"iscategorical\": \"0\"\n },\n {\n \"id\": \"blabla\",\n \"iscategorical\": \"0\"\n }\n ],\n \"masks\": {\n \"id\": \"valore\"\n },\n \"om_points\": \"value\",\n \"parameters\": {\n \"id\": \"valore\"\n }\n}\n\nThen you can use your code:\nimport json\nfrom pprint import pprint\n\nwith open('data.json') as data_file: \n data = json.load(data_file)\n\npprint(data)\n\nWith data, you can now also find values like so:\ndata[\"maps\"][0][\"id\"]\ndata[\"masks\"][\"id\"]\ndata[\"om_points\"]\n\nTry those out and see if it starts to make sense.\n"} {"input": "How can I print a literal \"{}\" characters in python string and also use .format on it? 
x = \" \\{ Hello \\} {0} \"\nprint x.format(42)\n\ngives me : Key Error: Hello\\\\\nI want to print the output: {Hello} 42\n", "output": "You need to double the {{ and }}:\n>>> x = \" {{ Hello }} {0} \"\n>>> print x.format(42)\n' { Hello } 42 '\n\nHere's the relevant part of the Python documentation for format string syntax:\n\nFormat strings contain \u00e2\u0080\u009creplacement fields\u00e2\u0080\u009d surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output. If you need to include a brace character in the literal text, it can be escaped by doubling: {{ and }}.\n\n"} {"input": "If Python is interpreted, what are .pyc files? I've been given to understand that Python is an interpreted language... However, when I look at my Python source code I see .pyc files, which Windows identifies as \"Compiled Python Files\". Where do these come in?\n", "output": "\nI've been given to understand that\n Python is an interpreted language...\n\nThis popular meme is incorrect, or, rather, constructed upon a misunderstanding of (natural) language levels: a similar mistake would be to say \"the Bible is a hardcover book\". 
Let me explain that simile...\n\"The Bible\" is \"a book\" in the sense of being a class of (actual, physical objects identified as) books; the books identified as \"copies of the Bible\" are supposed to have something fundamental in common (the contents, although even those can be in different languages, with different acceptable translations, levels of footnotes and other annotations) -- however, those books are perfectly well allowed to differ in a myriad of aspects that are not considered fundamental -- kind of binding, color of binding, font(s) used in the printing, illustrations if any, wide writable margins or not, numbers and kinds of builtin bookmarks, and so on, and so forth.\nIt's quite possible that a typical printing of the Bible would indeed be in hardcover binding -- after all, it's a book that's typically meant to be read over and over, bookmarked at several places, thumbed through looking for given chapter-and-verse pointers, etc, etc, and a good hardcover binding can make a given copy last longer under such use. 
However, these are mundane (practical) issues that cannot be used to determine whether a given actual book object is a copy of the Bible or not: paperback printings are perfectly possible!\nSimilarly, Python is \"a language\" in the sense of defining a class of language implementations which must all be similar in some fundamental respects (syntax, most semantics except those parts of those where they're explicitly allowed to differ) but are fully allowed to differ in just about every \"implementation\" detail -- including how they deal with the source files they're given, whether they compile the sources to some lower level forms (and, if so, which form -- and whether they save such compiled forms, to disk or elsewhere), how they execute said forms, and so forth.\nThe classical implementation, CPython, is often called just \"Python\" for short -- but it's just one of several production-quality implementations, side by side with Microsoft's IronPython (which compiles to CLR codes, i.e., \".NET\"), Jython (which compiles to JVM codes), PyPy (which is written in Python itself and can compile to a huge variety of \"back-end\" forms including \"just-in-time\" generated machine language). They're all Python (==\"implementations of the Python language\") just like many superficially different book objects can all be Bibles (==\"copies of The Bible\").\nIf you're interested in CPython specifically: it compiles the source files into a Python-specific lower-level form (known as \"bytecode\"), does so automatically when needed (when there is no bytecode file corresponding to a source file, or the bytecode file is older than the source or compiled by a different Python version), usually saves the bytecode files to disk (to avoid recompiling them in the future). 
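That compile-on-demand step can even be triggered by hand with the standard library's py_compile module. A minimal sketch (file paths hypothetical):

```python
import os
import py_compile
import tempfile

# Write a tiny source file somewhere temporary (path hypothetical).
src = os.path.join(tempfile.mkdtemp(), "hello.py")
with open(src, "w") as f:
    f.write("x = 1 + 1\n")

# Ask CPython to byte-compile it; on Python 3 the return value is the
# path of the generated bytecode file (under __pycache__).
cache_file = py_compile.compile(src)
assert cache_file.endswith(".pyc")
assert os.path.exists(cache_file)
```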
OTOH IronPython will typically compile to CLR codes (saving them to disk or not, depending) and Jython to JVM codes (saving them to disk or not -- it will use the .class extension if it does save them).\nThese lower level forms are then executed by appropriate \"virtual machines\" also known as \"interpreters\" -- the CPython VM, the .Net runtime, the Java VM (aka JVM), as appropriate.\nSo, in this sense (what do typical implementations do), Python is an \"interpreted language\" if and only if C# and Java are: all of them have a typical implementation strategy of producing bytecode first, then executing it via a VM/interpreter.\nMore likely the focus is on how \"heavy\", slow, and high-ceremony the compilation process is. CPython is designed to compile as fast as possible, as lightweight as possible, with as little ceremony as feasible -- the compiler does very little error checking and optimization, so it can run fast and in small amounts of memory, which in turns lets it be run automatically and transparently whenever needed, without the user even needing to be aware that there is a compilation going on, most of the time. Java and C# typically accept more work during compilation (and therefore don't perform automatic compilation) in order to check errors more thoroughly and perform more optimizations. It's a continuum of gray scales, not a black or white situation, and it would be utterly arbitrary to put a threshold at some given level and say that only above that level you call it \"compilation\"!-)\n"} {"input": "Use different Python version with virtualenv I have a Debian system currently running with python 2.5.4. I got virtualenv properly installed, everything is working fine. Is there a possibility that I can use a virtualenv with a different version of Python?\nI compiled Python 2.6.2 and would like to use it with some virtualenv. Is it enough to overwrite the binary file? 
Or do I have to change something in respect to the libraries?\n", "output": "Just use the -p flag when creating your virtualenv instance to specify the Python executable you want to use, e.g.:\nvirtualenv -p /usr/bin/python2.6 \n\n"} {"input": "How to import a module given the full path? How can I load a Python module given its full path? Note that the file can be anywhere in the filesystem, as it is a configuration option.\n", "output": "For Python 3.5+ use:\nimport importlib.util\nspec = importlib.util.spec_from_file_location(\"module.name\", \"/path/to/file.py\")\nfoo = importlib.util.module_from_spec(spec)\nspec.loader.exec_module(foo)\nfoo.MyClass()\n\nFor Python 3.3 and 3.4 use:\nfrom importlib.machinery import SourceFileLoader\n\nfoo = SourceFileLoader(\"module.name\", \"/path/to/file.py\").load_module()\nfoo.MyClass()\n\n(Although this has been deprecated in Python 3.4.)\nPython 2 use:\nimport imp\n\nfoo = imp.load_source('module.name', '/path/to/file.py')\nfoo.MyClass()\n\nThere are equivalent convenience functions for compiled Python files and DLLs.\nSee also. http://bugs.python.org/issue21436.\n"} {"input": "Python @property versus getters and setters Here is a pure Python-specific design question: \nclass MyClass(object):\n ...\n def get_my_attr(self):\n ...\n\n def set_my_attr(self, value):\n ...\n\nand \nclass MyClass(object):\n ... \n @property\n def my_attr(self):\n ...\n\n @my_attr.setter\n def my_attr(self, value):\n ...\n\nPython lets us to do it either way. If you would design a Python program, which approach would you use and why?\n", "output": "Prefer properties. It's what they're there for.\nThe reason is that all attributes are public in Python. Starting names with an underscore or two is just a warning that the given attribute is an implementation detail that may not stay the same in future versions of the code. It doesn't prevent you from actually getting or setting that attribute. 
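A quick sketch (class and names hypothetical) showing that the leading underscore is convention only, not enforcement:

```python
class Account:
    def __init__(self):
        self._balance = 0  # underscore signals "implementation detail"

acct = Account()

# Nothing stops outside code from reading or writing the attribute.
acct._balance = 100
assert acct._balance == 100
```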
Therefore, standard attribute access is the normal, Pythonic way of, well, accessing attributes.\nThe advantage of properties is that they are syntactically identical to attribute access, so you can change from one to another without any changes to client code. You could even have one version of a class that uses properties (say, for code-by-contract or debugging) and one that doesn't for production, without changing the code that uses it. At the same time, you don't have to write getters and setters for everything just in case you might need to better control access later.\n"} {"input": "How do I download a file over HTTP using Python? I have a small utility that I use to download a MP3 from a website on a schedule and then builds/updates a podcast XML file which I've obviously added to iTunes.\nThe text processing that creates/updates the XML file is written in Python. I use wget inside a Windows .bat file to download the actual MP3 however. I would prefer to have the entire utility written in Python though.\nI struggled though to find a way to actually down load the file in Python, thus why I resorted to wget.\nSo, how do I download the file using Python?\n", "output": "One more, using urlretrieve:\nimport urllib\nurllib.urlretrieve (\"http://www.example.com/songs/mp3.mp3\", \"mp3.mp3\")\n\n(for Python 3+ use 'import urllib.request' and urllib.request.urlretrieve)\nYet another one, with a \"progressbar\"\nimport urllib2\n\nurl = \"http://download.thinkbroadband.com/10MB.zip\"\n\nfile_name = url.split('/')[-1]\nu = urllib2.urlopen(url)\nf = open(file_name, 'wb')\nmeta = u.info()\nfile_size = int(meta.getheaders(\"Content-Length\")[0])\nprint \"Downloading: %s Bytes: %s\" % (file_name, file_size)\n\nfile_size_dl = 0\nblock_sz = 8192\nwhile True:\n buffer = u.read(block_sz)\n if not buffer:\n break\n\n file_size_dl += len(buffer)\n f.write(buffer)\n status = r\"%10d [%3.2f%%]\" % (file_size_dl, file_size_dl * 100. 
/ file_size)\n status = status + chr(8)*(len(status)+1)\n print status,\n\nf.close()\n\n"} {"input": "Way to create multiline comments? I have recently started studying Python, but I couldn't find how to implement multi-line comments. Most languages have block comment symbols like \n/* \n\n*/\n\nI tried this in Python, but it throws an error, so this probably is not the correct way. Does Python actually have a multiline comment feature?\n", "output": "You can use triple-quoted strings. When they're not a docstring (first thing in a class/function/module), they are ignored.\n'''\nThis is a multiline\ncomment.\n'''\n\nGuido van Rossum (creator of Python) tweeted this as a \"pro tip\".\nHowever, Python's style guide, PEP8, favors using consecutive single-line comments, and this is also what you'll find in many projects. Editors usually have a shortcut to do this easily.\n"} {"input": "python setup.py uninstall I have installed a python package with python setup.py install.\nHow do I uninstall it?\n", "output": "You need to remove all files manually, and also undo any other stuff that installation did manually.\nIf you don't know the list of all files, you can reinstall it with the --record option, and take a look at the list this produces.\nTo record list of installed files, you can use:\npython setup.py install --record files.txt\n\nOnce you want to uninstall you can use xargs to do the removal:\ncat files.txt | xargs rm -rf\n\n"} {"input": "How do I remove/delete a folder that is not empty with Python? I am getting an 'access is denied' error when I attempt to delete a folder that is not empty. I used the following command in my attempt: os.remove(\"/folder_name\"). \nWhat is the most effective way of removing/deleting a folder/directory that is not empty?\n", "output": "import shutil\n\nshutil.rmtree('/folder_name')\n\nStandard Library Reference: shutil.rmtree.\n"} {"input": "Why shouldn't I use PyPy over CPython if PyPy is 6.3 times faster? 
I've been hearing a lot about the PyPy project. They claim it is 6.3 times faster than the CPython interpreter on their site.\nWhenever we talk about dynamic languages like Python, speed is one of the top issues. To solve this, they say PyPy is 6.3 times faster.\nThe second issue is parallelism, the infamous Global Interpreter Lock (GIL). For this, PyPy says it can give GIL-less Python.\nIf PyPy can solve these great challenges, what are its weaknesses that are preventing wider adoption? That is to say, what's preventing someone like me, a typical Python developer, from switching to PyPy right now? \n", "output": "PyPy, as others have been quick to mention, has tenuous support for C extensions. It has support, but typically at slower-than-Python speeds and it's iffy at best. Hence a lot of modules simply require CPython. Cython and Numpy are awesome for numerics, and most people who actually need speed in Python are using those (+ Pandas, SciPy, etc.) heavily. Since they're either non-existent or tenuously supported and slow, the people who need a fast Python often are better off with CPython both for speed and ease-of-use.\nSecondly, Python 3 support was experimental for a while, but has since reached stable: as of 20th June 2014, PyPy3 2.3.1 - Fulcrum is out!\nThirdly, PyPy sometimes isn't actually faster for \"scripts\", which a lot of people use Python for. These are the short-running programs that do something simple and small. Because PyPy is a JIT compiler its main advantages come from long run times and simple types (such as numbers). Frankly, PyPy's pre-JIT speeds are pretty bad compared to CPython.\nFourthly, inertia. Moving to PyPy often requires retooling, which for some people and organizations is simply too much work.\nThose are the main reasons that affect me, I'd say.\n"} {"input": "Correct way to write line to file in Python I'm used to doing print >>f, \"hi there\"\nHowever, it seems that print >> is getting deprecated. 
What is the recommended way to do the line above?\nUpdate:\nRegarding all those answers with \"\\n\"...is this universal or Unix-specific? IE, should I be doing \"\\r\\n\" on Windows?\n", "output": "You should use the new print() statement, available with Python 2.6+\nfrom __future__ import print_function\nprint(\"hi there\", file=f)\n\nThe alternative would be to use:\nf = open('myfile','w')\nf.write('hi there\\n') # python will convert \\n to os.linesep\nf.close() # you can omit in most cases as the destructor will call it\n\nQuoting from Python documentation regarding newlines:\n\nOn output, if newline is None, any '\\n' characters written are translated to the system default line separator, os.linesep. If newline is '', no translation takes place. If newline is any of the other legal values, any '\\n' characters written are translated to the given string.\n\n"} {"input": "UnicodeEncodeError: 'ascii' codec can't encode character u'\\xa0' in position 20: ordinal not in range(128) I'm having problems dealing with unicode characters from text fetched from different web pages (on different sites). I am using BeautifulSoup. \nThe problem is that the error is not always reproducible; it sometimes works with some pages, and sometimes, it barfs by throwing a UnicodeEncodeError. 
I have tried just about everything I can think of, and yet I have not found anything that works consistently without throwing some kind of Unicode-related error.\nOne of the sections of code that is causing problems is shown below:\nagent_telno = agent.find('div', 'agent_contact_number')\nagent_telno = '' if agent_telno is None else agent_telno.contents[0]\np.agent_info = str(agent_contact + ' ' + agent_telno).strip()\n\nHere is a stack trace produced on SOME strings when the snippet above is run:\nTraceback (most recent call last):\n File \"foobar.py\", line 792, in <module>\n p.agent_info = str(agent_contact + ' ' + agent_telno).strip()\nUnicodeEncodeError: 'ascii' codec can't encode character u'\\xa0' in position 20: ordinal not in range(128)\n\nI suspect that this is because some pages (or more specifically, pages from some of the sites) may be encoded, whilst others may be unencoded. All the sites are based in the UK and provide data meant for UK consumption - so there are no issues relating to internationalization or dealing with text written in anything other than English.\nDoes anyone have any ideas as to how to solve this so that I can CONSISTENTLY fix this problem?\n", "output": "You need to read the Python Unicode HOWTO. This error is the very first example.\nBasically, stop using str to convert from unicode to encoded text / bytes.\nInstead, properly use .encode() to encode the string:\np.agent_info = u' '.join((agent_contact, agent_telno)).encode('utf-8').strip()\n\nor work entirely in unicode.\n"} {"input": "How to get file creation & modification date/times in Python? I have a script that needs to do some stuff based on file creation & modification dates but has to run on Linux & Windows.\nWhat's the best cross-platform way to get file creation & modification date/times in Python?\n", "output": "You have a couple of choices. 
For one, you can use the os.path.getmtime and os.path.getctime functions:\nimport os.path, time\nprint \"last modified: %s\" % time.ctime(os.path.getmtime(file))\nprint \"created: %s\" % time.ctime(os.path.getctime(file))\n\nYour other option is to use os.stat:\nimport os, time\n(mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime) = os.stat(file)\nprint \"last modified: %s\" % time.ctime(mtime)\n\nNote: ctime() does not refer to creation time on *nix systems, but rather the last time the inode data changed. (thanks to kojiro for making that fact more clear in the comments by providing a link to an interesting blog post)\n"} {"input": "Installing specific package versions with pip I'm trying to install version 1.2.2 of the MySQL_python adaptor. The current version shown in PyPi is 1.2.3. Is there a way to install the older version? I found an article stating that this should do it:\npip install MySQL_python==1.2.2\n\nWhen installed, however, it still shows MySQL_python-1.2.3-py2.6.egg-info in the site packages. Is this a problem specific to this package, or am I doing something wrong?\n", "output": "First, I see two issues with what you're trying to do. Since you already have an installed version, you should either uninstall the current existing driver or use pip install -I MySQL_python==1.2.2\nHowever, you'll soon find out that this doesn't work. If you look at pip's installation log, or if you do a pip install -Iv MySQL_python==1.2.2 you'll find that the PyPI URL link does not work for MySQL_python v1.2.2. 
You can verify this here: http://pypi.python.org/pypi/MySQL-python/1.2.2\nThe download link 404s and the fallback URL links are re-directing infinitely due to sourceforge.net's recent upgrade and PyPI's stale URL.\nSo to properly install the driver, you can follow these steps:\npip uninstall MySQL_python\npip install -Iv http://sourceforge.net/projects/mysql-python/files/mysql-python/1.2.2/MySQL-python-1.2.2.tar.gz/download\n\n"} {"input": "What are \"named tuples\" in Python? Reading the changes in Python 3.1, I found something... unexpected:\n\nThe sys.version_info tuple is now a named tuple:\n\nI never heard about named tuples before, and I thought elements could either be indexed by numbers (like in tuples and lists) or by keys (like in dicts). I never expected they could be indexed both ways.\nThus, my questions are:\n\nWhat are named tuples?\nHow to use them?\nWhy/when should I use named tuples instead of normal tuples?\nWhy/when should I use normal tuples instead of named tuples?\nIs there any kind of \"named list\" (a mutable version of the named tuple)?\n\n", "output": "Named tuples are basically easy to create, lightweight object types. Named tuple instances can be referenced using object-like variable dereferencing or the standard tuple syntax. They can be used similarly to struct or other common record types, except that they are immutable. They were added in Python 2.6 and Python 3.0, although there is a recipe for implementation in Python 2.4.\nFor example, it is common to represent a point, for example as a tuple (x, y). 
This leads to code like the following:\npt1 = (1.0, 5.0)\npt2 = (2.5, 1.5)\n\nfrom math import sqrt\nline_length = sqrt((pt1[0]-pt2[0])**2 + (pt1[1]-pt2[1])**2)\n\nUsing a named tuple it becomes more readable:\nfrom collections import namedtuple\nPoint = namedtuple('Point', 'x y')\npt1 = Point(1.0, 5.0)\npt2 = Point(2.5, 1.5)\n\nfrom math import sqrt\nline_length = sqrt((pt1.x-pt2.x)**2 + (pt1.y-pt2.y)**2)\n\nHowever, named tuples are still backwards compatible with normal tuples, so the following will still work:\nPoint = namedtuple('Point', 'x y')\npt1 = Point(1.0, 5.0)\npt2 = Point(2.5, 1.5)\n\nfrom math import sqrt\n# use index referencing\nline_length = sqrt((pt1[0]-pt2[0])**2 + (pt1[1]-pt2[1])**2)\n # use tuple unpacking\nx1, y1 = pt1\n\nThus, you should use named tuples instead of tuples anywhere you think object notation will make your code more pythonic and more easily readable. I personally have started using them to represent very simple value types, particularly when passing them as parameters to functions. It makes the functions more readable, without seeing the context of the tuple packing.\nFurthermore, you can also replace ordinary immutable classes that have no functions, only fields, with them. You can even use your named tuple types as base classes:\nclass Point(namedtuple('Point', 'x y')):\n [...]\n\nHowever, as with tuples, attributes in named tuples are immutable:\n>>> Point = namedtuple('Point', 'x y')\n>>> pt1 = Point(1.0, 5.0)\n>>> pt1.x = 2.0\nAttributeError: can't set attribute\n\nIf you want to be able to change the values, you need another type. There is a handy recipe for mutable recordtypes which allow you to set new values to attributes.\n>>> from rcdtype import *\n>>> Point = recordtype('Point', 'x y')\n>>> pt1 = Point(1.0, 5.0)\n>>> pt1.x = 2.0\n>>> print(pt1[0])\n 2.0\n\nI am not aware of any form of \"named list\" that lets you add new fields, however. 
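If all you need is to change an existing field (rather than add a new one), note that named tuples already provide a _replace method, which returns an updated copy without mutating the original:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)
pt2 = pt1._replace(x=2.0)  # builds a new Point; pt1 itself is unchanged

print(pt2)    # Point(x=2.0, y=5.0)
print(pt1.x)  # 1.0
```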
You may just want to use a dictionary in this situation. Named tuples can be converted to dictionaries using pt1._asdict() which returns {'x': 1.0, 'y': 5.0} and can be operated upon with all the usual dictionary functions. \nAs already noted, you should check the documentation for more information from which these examples were constructed.\n"} {"input": "Speed comparison with Project Euler: C vs Python vs Erlang vs Haskell I have taken Problem #12 from Project Euler as a programming exercise and to compare my (surely not optimal) implementations in C, Python, Erlang and Haskell. In order to get some higher execution times, I search for the first triangle number with more than 1000 divisors instead of 500 as stated in the original problem.\nThe result is the following:\nC:\nlorenzo@enzo:~/erlang$ gcc -lm -o euler12.bin euler12.c\nlorenzo@enzo:~/erlang$ time ./euler12.bin\n842161320\n\nreal 0m11.074s\nuser 0m11.070s\nsys 0m0.000s\n\npython:\nlorenzo@enzo:~/erlang$ time ./euler12.py \n842161320\n\nreal 1m16.632s\nuser 1m16.370s\nsys 0m0.250s\n\npython with pypy:\nlorenzo@enzo:~/Downloads/pypy-c-jit-43780-b590cf6de419-linux64/bin$ time ./pypy /home/lorenzo/erlang/euler12.py \n842161320\n\nreal 0m13.082s\nuser 0m13.050s\nsys 0m0.020s\n\nerlang:\nlorenzo@enzo:~/erlang$ erlc euler12.erl \nlorenzo@enzo:~/erlang$ time erl -s euler12 solve\nErlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false]\n\nEshell V5.7.4 (abort with ^G)\n1> 842161320\n\nreal 0m48.259s\nuser 0m48.070s\nsys 0m0.020s\n\nhaskell:\nlorenzo@enzo:~/erlang$ ghc euler12.hs -o euler12.hsx\n[1 of 1] Compiling Main ( euler12.hs, euler12.o )\nLinking euler12.hsx ...\nlorenzo@enzo:~/erlang$ time ./euler12.hsx \n842161320\n\nreal 2m37.326s\nuser 2m37.240s\nsys 0m0.080s\n\nSummary:\n\nC: 100%\npython: 692% (118% with pypy)\nerlang: 436% (135% thanks to RichardC)\nhaskell: 1421%\n\nI suppose that C has a big advantage as it uses long for the calculations and not 
arbitrary length integers as the other three. Also it doesn't need to load a runtime first (Do the others?).\nQuestion 1:\nDo Erlang, Python and Haskell lose speed due to using arbitrary length integers or don't they as long as the values are less than MAXINT?\nQuestion 2:\nWhy is Haskell so slow? Is there a compiler flag that turns off the brakes or is it my implementation? (The latter is quite probable as Haskell is a book with seven seals to me.)\nQuestion 3:\nCan you offer me some hints how to optimize these implementations without changing the way I determine the factors? Optimization in any way: nicer, faster, more \"native\" to the language.\nEDIT:\nQuestion 4:\nDo my functional implementations permit LCO (last call optimization, a.k.a tail recursion elimination) and hence avoid adding unnecessary frames onto the call stack?\nI really tried to implement the same algorithm as similar as possible in the four languages, although I have to admit that my Haskell and Erlang knowledge is very limited.\n\nSource codes used:\n#include <stdio.h>\n#include <math.h>\n\nint factorCount (long n)\n{\n double square = sqrt (n);\n int isquare = (int) square;\n int count = isquare == square ? -1 : 0;\n long candidate;\n for (candidate = 1; candidate <= isquare; candidate ++)\n if (0 == n % candidate) count += 2;\n return count;\n}\n\nint main ()\n{\n long triangle = 1;\n int index = 1;\n while (factorCount (triangle) < 1001)\n {\n index ++;\n triangle += index;\n }\n printf (\"%ld\\n\", triangle);\n}\n\n\n#! 
/usr/bin/env python3.2\n\nimport math\n\ndef factorCount (n):\n square = math.sqrt (n)\n isquare = int (square)\n count = -1 if isquare == square else 0\n for candidate in range (1, isquare + 1):\n if not n % candidate: count += 2\n return count\n\ntriangle = 1\nindex = 1\nwhile factorCount (triangle) < 1001:\n index += 1\n triangle += index\n\nprint (triangle)\n\n\n-module (euler12).\n-compile (export_all).\n\nfactorCount (Number) -> factorCount (Number, math:sqrt (Number), 1, 0).\n\nfactorCount (_, Sqrt, Candidate, Count) when Candidate > Sqrt -> Count;\n\nfactorCount (_, Sqrt, Candidate, Count) when Candidate == Sqrt -> Count + 1;\n\nfactorCount (Number, Sqrt, Candidate, Count) ->\n case Number rem Candidate of\n 0 -> factorCount (Number, Sqrt, Candidate + 1, Count + 2);\n _ -> factorCount (Number, Sqrt, Candidate + 1, Count)\n end.\n\nnextTriangle (Index, Triangle) ->\n Count = factorCount (Triangle),\n if\n Count > 1000 -> Triangle;\n true -> nextTriangle (Index + 1, Triangle + Index + 1) \n end.\n\nsolve () ->\n io:format (\"~p~n\", [nextTriangle (1, 1) ] ),\n halt (0).\n\n\nfactorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare)\n where square = sqrt $ fromIntegral number\n isquare = floor square\n\nfactorCount' number sqrt candidate count\n | fromIntegral candidate > sqrt = count\n | number `mod` candidate == 0 = factorCount' number sqrt (candidate + 1) (count + 2)\n | otherwise = factorCount' number sqrt (candidate + 1) count\n\nnextTriangle index triangle\n | factorCount triangle > 1000 = triangle\n | otherwise = nextTriangle (index + 1) (triangle + index + 1)\n\nmain = print $ nextTriangle 1 1\n\n", "output": "Using GHC 7.0.3, gcc 4.4.6, Linux 2.6.29 on an x86_64 Core2 Duo (2.5GHz) machine, compiling using ghc -O2 -fllvm -fforce-recomp for Haskell and gcc -O3 -lm for C.\n\nYour C routine runs in 8.4 seconds (faster than your run probably because of -O3)\nThe Haskell solution runs in 36 seconds (due to the -O2 
flag)\nYour factorCount' code isn't explicitly typed and defaulting to Integer (thanks to Daniel for correcting my misdiagnosis here!). Giving an explicit type signature (which is standard practice anyway) using Int and the time changes to 11.1 seconds\nin factorCount' you have needlessly called fromIntegral. A fix results in no change though (the compiler is smart, lucky for you).\nYou used mod where rem is faster and sufficient. This changes the time to 8.5 seconds.\nfactorCount' is constantly applying two extra arguments that never change (number, sqrt). A worker/wrapper transformation gives us:\n\n $ time ./so\n 842161320 \n\n real 0m7.954s \n user 0m7.944s \n sys 0m0.004s \n\nThat's right, 7.95 seconds. Consistently half a second faster than the C solution. Without the -fllvm flag I'm still getting 8.182 seconds, so the NCG backend is doing well in this case too.\nConclusion: Haskell is awesome.\nResulting Code\nfactorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare)\n where square = sqrt $ fromIntegral number\n isquare = floor square\n\nfactorCount' :: Int -> Int -> Int -> Int -> Int\nfactorCount' number sqrt candidate0 count0 = go candidate0 count0\n where\n go candidate count\n | candidate > sqrt = count\n | number `rem` candidate == 0 = go (candidate + 1) (count + 2)\n | otherwise = go (candidate + 1) count\n\nnextTriangle index triangle\n | factorCount triangle > 1000 = triangle\n | otherwise = nextTriangle (index + 1) (triangle + index + 1)\n\nmain = print $ nextTriangle 1 1\n\nEDIT: So now that we've explored that, lets address the questions\n\nQuestion 1: Do erlang, python and haskell lose speed due to using\n arbitrary length integers or don't they as long as the values are less\n than MAXINT?\n\nIn Haskell, using Integer is slower than Int but how much slower depends on the computations performed. Luckily (for 64 bit machines) Int is sufficient. 
For portability sake you should probably rewrite my code to use Int64 or Word64 (C isn't the only language with a long).\n\nQuestion 2: Why is haskell so slow? Is there a compiler flag that\n turns off the brakes or is it my implementation? (The latter is quite\n probable as haskell is a book with seven seals to me.)\nQuestion 3: Can you offer me some hints how to optimize these\n implementations without changing the way I determine the factors?\n Optimization in any way: nicer, faster, more \"native\" to the language.\n\nThat was what I answered above. The answer was \n\n0) Use optimization via -O2 \n1) Use fast (notably: unbox-able) types when possible \n2) rem not mod (a frequently forgotten optimization) and \n3) worker/wrapper transformation (perhaps the most common optimization).\n\n\nQuestion 4: Do my functional implementations permit LCO and hence\n avoid adding unnecessary frames onto the call stack?\n\nYes, that wasn't the issue. Good work and glad you considered this.\n"} {"input": "'has_key()' or 'in'? I wonder what is better to do:\nd = {'a': 1, 'b': 2}\n'a' in d\nTrue\n\nor:\nd = {'a': 1, 'b': 2}\nd.has_key('a')\nTrue\n\n", "output": "in is definitely more pythonic.\nIn fact has_key() was removed in Python 3.x.\n"} {"input": "Why is [] faster than list()? I recently compared the processing speeds of [] and list() and was surprised to discover that [] runs more than three times faster than list(). I ran the same test with {} and dict() and the results were practically identical: [] and {} both took around 0.128sec / million cycles, while list() and dict() took roughly 0.428sec / million cycles each.\nWhy is this? 
Do [] and {} (probably () and '', too) immediately pass back copies of some empty stock literal while their explicitly-named counterparts (list(), dict(), tuple(), str()) fully go about creating an object, whether or not they actually have elements?\nI have no idea how these two methods differ but I'd love to find out.\nI couldn't find an answer in the docs or on SO, and searching for empty brackets turned out to be more complicated than I'd expected.\nI got my timing results by calling timeit.timeit(\"[]\") and timeit.timeit(\"list()\"), and timeit.timeit(\"{}\") and timeit.timeit(\"dict()\"), to compare lists and dictionaries, respectively. I'm running Python 2.7.9.\nI recently discovered \"Why is if True slower than if 1?\" that compares the performance of if True to if 1 and seems to touch on a similar literal-versus-global scenario; perhaps it's worth considering as well.\n", "output": "Because [] and {} are literal syntax. Python can create bytecode just to create the list or dictionary objects:\n>>> import dis\n>>> dis.dis(compile('[]', '', 'eval'))\n 1 0 BUILD_LIST 0\n 3 RETURN_VALUE \n>>> dis.dis(compile('{}', '', 'eval'))\n 1 0 BUILD_MAP 0\n 3 RETURN_VALUE \n\nlist() and dict() are separate objects. Their names need to be resolved, the stack has to be involved to push the arguments, the frame has to be stored to retrieve later, and a call has to be made. 
That all takes more time.\nFor the empty case, that means you have at the very least a LOAD_NAME (which has to search through the global namespace as well as the __builtin__ module) followed by a CALL_FUNCTION, which has to preserve the current frame:\n>>> dis.dis(compile('list()', '', 'eval'))\n 1 0 LOAD_NAME 0 (list)\n 3 CALL_FUNCTION 0\n 6 RETURN_VALUE \n>>> dis.dis(compile('dict()', '', 'eval'))\n 1 0 LOAD_NAME 0 (dict)\n 3 CALL_FUNCTION 0\n 6 RETURN_VALUE \n\nYou can time the name lookup separately with timeit:\n>>> import timeit\n>>> timeit.timeit('list', number=10**7)\n0.30749011039733887\n>>> timeit.timeit('dict', number=10**7)\n0.4215109348297119\n\nThe time discrepancy there is probably a dictionary hash collision. Subtract those times from the times for calling those objects, and compare the result against the times for using literals:\n>>> timeit.timeit('[]', number=10**7)\n0.30478692054748535\n>>> timeit.timeit('{}', number=10**7)\n0.31482696533203125\n>>> timeit.timeit('list()', number=10**7)\n0.9991960525512695\n>>> timeit.timeit('dict()', number=10**7)\n1.0200958251953125\n\nSo having to call the object takes an additional 1.00 - 0.31 - 0.30 == 0.39 seconds per 10 million calls.\nYou can avoid the global lookup cost by aliasing the global names as locals (using a timeit setup, everything you bind to a name is a local):\n>>> timeit.timeit('_list', '_list = list', number=10**7)\n0.1866450309753418\n>>> timeit.timeit('_dict', '_dict = dict', number=10**7)\n0.19016098976135254\n>>> timeit.timeit('_list()', '_list = list', number=10**7)\n0.841480016708374\n>>> timeit.timeit('_dict()', '_dict = dict', number=10**7)\n0.7233691215515137\n\nbut you never can overcome that CALL_FUNCTION cost.\n"} {"input": "How to get line count cheaply in Python? I need to get a line count of a large file (hundreds of thousands of lines) in python. 
What is the most efficient way both memory- and time-wise?\nAt the moment I do:\ndef file_len(fname):\n with open(fname) as f:\n for i, l in enumerate(f):\n pass\n return i + 1\n\nIs it possible to do any better?\n", "output": "One line, probably pretty fast:\nnum_lines = sum(1 for line in open('myfile.txt'))\n\n"} {"input": "null object in Python? How do I refer to the null object in Python?\n", "output": "In Python, the 'null' object is the singleton None.\nThe best way to check things for \"Noneness\" is to use the identity operator, is:\nif foo is None:\n ...\n\n"} {"input": "Terminating a Python script I am aware of the die() command in PHP which stops a script early.\nHow can I do this in Python?\n", "output": "import sys\nsys.exit()\n\ndetails from the sys module documentation:\n\nsys.exit([arg])\n\nExit from Python. This is implemented by raising the\n SystemExit exception, so cleanup actions specified by finally clauses\n of try statements are honored, and it is possible to intercept the\n exit attempt at an outer level.\nThe optional argument arg can be an integer giving the exit status\n (defaulting to zero), or another type of object. If it is an integer,\n zero is considered \"successful termination\" and any nonzero value is\n considered \"abnormal termination\" by shells and the like. Most systems\n require it to be in the range 0-127, and produce undefined results\n otherwise. Some systems have a convention for assigning specific\n meanings to specific exit codes, but these are generally\n underdeveloped; Unix programs generally use 2 for command line syntax\n errors and 1 for all other kind of errors. If another type of object\n is passed, None is equivalent to passing zero, and any other object is\n printed to stderr and results in an exit code of 1. 
In particular,\n sys.exit(\"some error message\") is a quick way to exit a program when\n an error occurs.\nSince exit() ultimately \"only\" raises an exception, it will only exit\n the process when called from the main thread, and the exception is not\n intercepted.\n\n\nNote that this is the 'nice' way to exit. @glyphtwistedmatrix below points out that if you want a 'hard exit', you can use os._exit(errorcode), though it's likely os-specific to some extent (it might not take an errorcode under windows, for example), and it definitely is less friendly since it doesn't let the interpreter do any cleanup before the process dies.\n"} {"input": "Differences between distribute, distutils, setuptools and distutils2? The Situation\nI'm trying to port an open-source library to Python 3. (SymPy, if anyone is wondering.) \nSo, I need to run 2to3 automatically when building for Python 3. To do that, I need to use distribute. Therefore, I need to port the current system, which (according to the doctest) is distutils. \n\nThe Problem\nUnfortunately, I'm not sure what's the difference between these modules - distutils, distribute, setuptools. The documentation is sketchy at best, as they all seem to be a fork of one another, intended to be compatible in most circumstances (but actually, not all)... and so on, and so forth. \n\nThe Question\nCould someone explain the differences? What am I supposed to use? What is the most modern solution? (As an aside, I'd also appreciate some guide on porting to Distribute, but that's a tad beyond the scope of the question...)\n", "output": "As of September 2014, all of the other answers to this question are a year out-of-date. 
When you come across advice on Python packaging issues, remember to look at the date of publication, and don't trust out-of-date information.\nThe Python Packaging User Guide hosted on Readthedocs is worth a read. Every page has a \"last reviewed\" date displayed, so you can check the recency of the manual, and it's quite comprehensive. The official Python 3.4 docs now link to this resource, just to add credence to it.\nSummary of tools:\nHere's a summary of the Python packaging landscape in September 2014:\n\nDistutils is still the standard tool for packaging in Python. It is included in the standard library (Python 2 and Python 3.0 to 3.4). It is useful for simple Python distributions, but lacks features. It introduces the distutils Python package that can be imported in your setup.py script.\nSetuptools was developed to overcome Distutils' limitations, and is not included in the standard library. It introduced a command-line utility called easy_install. It also introduced the setuptools Python package that can be imported in your setup.py script, and the pkg_resources Python package that can be imported in your code to locate data files installed with a distribution. One of its gotchas is that it monkey-patches the distutils Python package. It should work well with pip. It sees regular releases.\nDistribute was a fork of Setuptools. It shared the same namespace, so if you had Distribute installed, import setuptools would actually import the package distributed with Distribute. Distribute was merged back into Setuptools 0.7, so you don't need to use Distribute any more. In fact, the version on Pypi is just a compatibility layer that installs Setuptools.\nDistutils2 was an attempt to take the best of Distutils, Setuptools and Distribute and become the standard tool included in Python's standard library. 
The idea was that Distutils2 would be distributed for old Python versions, and that Distutils2 would be renamed to packaging for Python 3.3, which would include it in its standard library. These plans did not go as intended, however, and currently, Distutils2 is an abandoned project. The latest release was in March 2012, and its Pypi home page has finally been updated to reflect its death.\nDistlib is a tool that aims to implement a subset of the previous tools' functionality, but only functionality that is very well-defined in accepted PEPs. It should hopefully be included eventually in the Python standard library. It is still being developed and is not recommended for end-users yet.\nBento is a packaging solution designed to replace Distutils, Setuptools, Distribute and Distutils2, written from the ground up. Its primary developer is also a core developer of numpy/scipy, so he's familiar with non-simple use-cases for packaging systems. Its first commit was in October 2009, and the latest commit as of writing was in August 2014, although the authors are not updating its Pypi page correspondingly. It's in active development but it is not mature yet, and it is not as widely known as Setuptools yet.\n\nRecommendation:\nSo in conclusion, out of all these options, I would recommend Setuptools, unless your requirements are very basic and you only need Distutils. 
Setuptools works very well with Virtualenv and Pip, tools that I highly recommend.\nAs a side-note, I recommend using Virtualenv 1.10 or higher, as it is the first release that recognises the Setuptools/Distribute merger, for both Python 2 and 3.\n"} {"input": "How to find if directory exists in Python In the os module in Python, is there a way to find if a directory exists, something like:\n>>> os.direxists(os.path.join(os.getcwd(), 'new_folder')) # in pseudocode\nTrue/False\n\n", "output": "You're looking for os.path.isdir, or os.path.exists if you don't care whether it's a file or a directory.\nExample:\nimport os\nprint(os.path.isdir(\"/home/el\"))\nprint(os.path.exists(\"/home/el/myfile.txt\"))\n\n"} {"input": "pip install mysql-python fails with EnvironmentError: mysql_config not found This is the error I get\n(mysite)zjm1126@zjm1126-G41MT-S2:~/zjm_test/mysite$ pip install mysql-python\nDownloading/unpacking mysql-python\n Downloading MySQL-python-1.2.3.tar.gz (70Kb): 70Kb downloaded\n Running setup.py egg_info for package mysql-python\n sh: mysql_config: not found\n Traceback (most recent call last):\n File \"<string>\", line 14, in <module>\n File \"/home/zjm1126/zjm_test/mysite/build/mysql-python/setup.py\", line 15, in <module>\n metadata, options = get_config()\n File \"setup_posix.py\", line 43, in get_config\n libs = mysql_config(\"libs_r\")\n File \"setup_posix.py\", line 24, in mysql_config\n raise EnvironmentError(\"%s not found\" % (mysql_config.path,))\n EnvironmentError: mysql_config not found\n Complete output from command python setup.py egg_info:\n sh: mysql_config: not found\n\nTraceback (most recent call last):\n\n File \"<string>\", line 14, in <module>\n\n File \"/home/zjm1126/zjm_test/mysite/build/mysql-python/setup.py\", line 15, in <module>\n\n metadata, options = get_config()\n\n File \"setup_posix.py\", line 43, in get_config\n\n libs = mysql_config(\"libs_r\")\n\n File \"setup_posix.py\", line 24, in mysql_config\n\n raise EnvironmentError(\"%s not found\" % 
(mysql_config.path,))\n\nEnvironmentError: mysql_config not found\n\n----------------------------------------\nCommand python setup.py egg_info failed with error code 1\nStoring complete log in /home/zjm1126/.pip/pip.log\n(mysite)zjm1126@zjm1126-G41MT-S2:~/zjm_test/mysite$ pip install mysql-python\nDownloading/unpacking mysql-python\n Running setup.py egg_info for package mysql-python\n sh: mysql_config: not found\n Traceback (most recent call last):\n File \"<string>\", line 14, in <module>\n File \"/home/zjm1126/zjm_test/mysite/build/mysql-python/setup.py\", line 15, in <module>\n metadata, options = get_config()\n File \"setup_posix.py\", line 43, in get_config\n libs = mysql_config(\"libs_r\")\n File \"setup_posix.py\", line 24, in mysql_config\n raise EnvironmentError(\"%s not found\" % (mysql_config.path,))\n EnvironmentError: mysql_config not found\n Complete output from command python setup.py egg_info:\n sh: mysql_config: not found\n\nTraceback (most recent call last):\n\n File \"<string>\", line 14, in <module>\n\n File \"/home/zjm1126/zjm_test/mysite/build/mysql-python/setup.py\", line 15, in <module>\n\n metadata, options = get_config()\n\n File \"setup_posix.py\", line 43, in get_config\n\n libs = mysql_config(\"libs_r\")\n\n File \"setup_posix.py\", line 24, in mysql_config\n\n raise EnvironmentError(\"%s not found\" % (mysql_config.path,))\n\nEnvironmentError: mysql_config not found\n\n----------------------------------------\nCommand python setup.py egg_info failed with error code 1\nStoring complete log in /home/zjm1126/.pip/pip.log\n\nWhat can I do to resolve this?\n", "output": "It seems mysql_config is missing on your system or the installer could not find it.\nBe sure mysql_config is really installed.\nFor example on Debian/Ubuntu you must install the package:\nsudo apt-get install libmysqlclient-dev\n\nMaybe mysql_config is not in your path; that will be the case when you compile the MySQL suite yourself.\n"} {"input": "How can I force division to be floating point in Python? 
Division keeps rounding down to 0 I have two integer values a and b, but I need their ratio in floating point. I know that a < b, and I want to calculate a/b, so if I use integer division I'll always get 0 with a remainder of a. How can I force c = a / b to return 2.5?\n", "output": "In Python 2, convert one of the operands to float (or use from __future__ import division):\nc = a / float(b)\n\n"} {"input": "How can I safely create a nested directory in Python (the equivalent of mkdir -p)?\n", "output": "Try os.makedirs, and catch the error raised if the leaf directory already exists:\nimport os, errno\n\ntry:\n os.makedirs(path)\nexcept OSError as exc: # Python > 2.5\n if exc.errno == errno.EEXIST and os.path.isdir(path):\n pass\n else:\n raise\n\nUpdate\nFor Python >= 3.2, os.makedirs has an optional third argument exist_ok that, when true, enables the mkdir -p functionality - unless mode is provided and the existing directory has different permissions than the intended ones; in that case, OSError is raised as previously.\n"} {"input": "Map two lists into a dictionary in Python Imagine that you have:\nkeys = ('name', 'age', 'food')\nvalues = ('Monty', 42, 'spam')\n\nWhat is the simplest way to produce the following dictionary?\ndict = {'name' : 'Monty', 'age' : 42, 'food' : 'spam'}\n\nThis code works, but I'm not really proud of it:\ndict = {}\njunk = map(lambda k, v: dict.update({k: v}), keys, values)\n\n", "output": "Like this:\n>>> keys = ['a', 'b', 'c']\n>>> values = [1, 2, 3]\n>>> dictionary = dict(zip(keys, values))\n>>> print dictionary\n{'a': 1, 'b': 2, 'c': 3}\n\nVoila :-) The pairwise dict constructor and zip function are awesomely useful: https://docs.python.org/2/library/functions.html#func-dict\n"} {"input": "How do you return multiple values in Python? The canonical way to return multiple values in languages that support it is often tupling. \nOption: Using a tuple\nConsider this trivial example:\ndef f(x):\n y0 = x + 1\n y1 = x * 3\n y2 = y0 ** y3\n return (y0,y1,y2)\n\nHowever, this quickly gets problematic as the number of values returned increases. What if you want to return four or five values? Sure, you could keep tupling them, but it gets easy to forget which value is where. It's also rather ugly to unpack them wherever you want to receive them.\nOption: Using a dictionary\nThe next logical step seems to be to introduce some sort of 'record notation'. In python, the obvious way to do this is by means of a dict. 
\nConsider the following:\ndef g(x):\n y0 = x + 1\n y1 = x * 3\n y2 = y0 ** y1\n return {'y0':y0, 'y1':y1 ,'y2':y2 }\n\n(edit- Just to be clear, y0, y1 and y2 are just meant as abstract identifiers. As pointed out, in practice you'd use meaningful identifiers)\nNow, we have a mechanism whereby we can project out a particular member of the returned object. For example, \nresult['y0']\n\nOption: Using a class\nHowever, there is another option. We could instead return a specialized structure. I've framed this in the context of Python, but I'm sure it applies to other languages as well. Indeed, if you were working in C this might very well be your only option. Here goes:\nclass ReturnValue(object):\n def __init__(self, y0, y1, y2):\n self.y0 = y0\n self.y1 = y1\n self.y2 = y2\n\ndef g(x):\n y0 = x + 1\n y1 = x * 3\n y2 = y0 ** y1\n return ReturnValue(y0, y1, y2)\n\nIn Python the previous two are perhaps very similar in terms of plumbing - after all, { y0, y1, y2 } just end up being entries in the internal __dict__ of the ReturnValue.\nThere is one additional feature provided by Python though for tiny objects, the __slots__ attribute. The class could be expressed as:\nclass ReturnValue(object):\n __slots__ = [\"y0\", \"y1\", \"y2\"]\n def __init__(self, y0, y1, y2):\n self.y0 = y0\n self.y1 = y1\n self.y2 = y2\n\nFrom the Python Reference Manual:\n\nThe __slots__ declaration takes a sequence of instance variables and reserves just enough space in each instance to hold a value for each variable. Space is saved because __dict__ is not created for each instance.\n\nOption: Using a list\nAnother suggestion which I'd overlooked comes from Bill the Lizard:\ndef h(x):\n result = [x + 1]\n result.append(x * 3)\n result.append(result[0] ** result[1])\n return result\n\nThis is my least favorite method though. I suppose I'm tainted by exposure to Haskell, but the idea of mixed-type lists has always felt uncomfortable to me. 
In this particular example the list is -not- mixed type, but it conceivably could be. A list used in this way really doesn't gain anything with respect to the tuple as far as I can tell. The only real difference between lists and tuples in Python is that lists are mutable, whereas tuples are not. I personally tend to carry over the conventions from functional programming: use lists for any number of elements of the same type, and tuples for a fixed number of elements of predetermined types.\nQuestion\nAfter the lengthy preamble comes the inevitable question. Which method (do you think) is best?\nI've typically found myself going the dictionary route because it involves less set-up work. From a types perspective however, you might be better off going the class route, since that may help you avoid confusing what a dictionary represents. On the other hand, there are some in the Python community that feel implied interfaces should be preferred to explicit interfaces, at which point the type of the object really isn't relevant, since you're basically relying on the convention that the same attribute will always have the same meaning. \nSo, how do -you- return multiple values in Python?\n", "output": "Named tuples were added in 2.6 for this purpose. Also see os.stat for a similar builtin example.\n>>> import collections\n>>> point = collections.namedtuple('Point', ['x', 'y'])\n>>> p = point(1, y=2)\n>>> p.x, p.y\n(1, 2)\n>>> p[0], p[1]\n(1, 2)\n\n"} {"input": "Count occurrence of a character in a string What's the simplest way to count the number of occurrences of a character in a string?\ne.g. count the number of times 'a' appears in 'Mary had a little lamb'\n", "output": "\nstr.count(sub[, start[, end]])\nReturn the number of non-overlapping occurrences of substring sub in the range [start, end]. 
Optional arguments start and end are interpreted as in slice notation.\n\n>>> sentence = 'Mary had a little lamb'\n>>> sentence.count('a')\n4\n\nOr:\nfrom collections import Counter\nsentence = 'Mary had a little lamb'\nprint(Counter(sentence)['a'])\n\n"} {"input": "Delete an element from a dictionary Is there a way to delete an element from a dictionary in Python?\nI know I can just call .pop on the dictionary, but that returns the element that was removed. What I'm looking for is something that returns the dictionary minus the element in question. \nAt present I have a helper function that accepts the dictionary in question as parameter, and then returns a dictionary with the element removed. Is there a more elegant solution?\n", "output": "The del statement removes an element:\ndel d[key]\n\nHowever, this mutates the existing dictionary so the contents of the dictionary changes for anybody else who has a reference to the same instance. To return a new dictionary, make a copy of the dictionary:\ndef removekey(d, key):\n r = dict(d)\n del r[key]\n return r\n\nThe dict() constructor makes a shallow copy. To make a deep copy, see the copy module.\n"} {"input": "Generate random integers between 0 and 9 How can I generate random integer numbers between 0 and 9 (inclusive) in Python?\ni.e. 0 1 2 3 4 5 6 7 8 9\n", "output": "Try:\nfrom random import randint\nprint(randint(0,9))\n\nMore info: https://docs.python.org/3.1/library/random.html#random.randint\n"} {"input": "How to create a constant in Python Is there a way to declare a constant in Python? In Java we can create a constant in this manner:\npublic static final String CONST_NAME = \"Name\";\n\nWhat is the equivalent of the above Java constant declaration in Python?\n", "output": "No, there is not. You cannot declare a variable or value as constant in Python. 
Just don't change it.\nIf you are in a class, the equivalent would be:\nclass Foo(object):\n CONST_NAME = \"Name\"\n\nIf not, it is just\nCONST_NAME = \"Name\"\n\nBut you might want to have a look at the code snippet Constants in Python by Alex Martelli.\n"} {"input": "Why does \"not(True) in [False, True]\" return False? If I do this:\n>>> False in [False, True]\nTrue\n\nThat returns True. Simply because False is in the list.\nBut if I do:\n>>> not(True) in [False, True]\nFalse\n\nThat returns False. Whereas not(True) is equal to False:\n>>> not(True)\nFalse\n\nWhy?\n", "output": "Operator precedence 2.x, 3.x. The precedence of not is lower than that of in. So it is equivalent to:\n>>> not (True in [False, True])\nFalse\n\nThis is what you want:\n>>> (not True) in [False, True]\nTrue\n\n\nAs @Ben points out: It's recommended to never write not(True), prefer not True. The former makes it look like a function call, while not is an operator, not a function.\n"} {"input": "\"Large data\" work flows using pandas I have tried to puzzle out an answer to this question for many months while learning pandas. I use SAS for my day-to-day work and it is great for its out-of-core support. However, SAS is horrible as a piece of software for numerous other reasons.\nOne day I hope to replace my use of SAS with python and pandas, but I currently lack an out-of-core workflow for large datasets. I'm not talking about \"big data\" that requires a distributed network, but rather files too large to fit in memory but small enough to fit on a hard-drive.\nMy first thought is to use HDFStore to hold large datasets on disk and pull only the pieces I need into dataframes for analysis. Others have mentioned MongoDB as an easier-to-use alternative. 
My question is this:\nWhat are some best-practice workflows for accomplishing the following:\n\nLoading flat files into a permanent, on-disk database structure\nQuerying that database to retrieve data to feed into a pandas data structure\nUpdating the database after manipulating pieces in pandas\n\nReal-world examples would be much appreciated, especially from anyone who uses pandas on \"large data\".\nEdit -- an example of how I would like this to work:\n\nIteratively import a large flat-file and store it in a permanent, on-disk database structure. These files are typically too large to fit in memory.\nIn order to use Pandas, I would like to read subsets of this data (usually just a few columns at a time) that can fit in memory.\nI would create new columns by performing various operations on the selected columns.\nI would then have to append these new columns into the database structure.\n\nI am trying to find a best-practice way of performing these steps. Reading links about pandas and pytables it seems that appending a new column could be a problem.\nEdit -- Responding to Jeff's questions specifically:\n\nI am building consumer credit risk models. The kinds of data include phone, SSN and address characteristics; property values; derogatory information like criminal records, bankruptcies, etc... The datasets I use every day have nearly 1,000 to 2,000 fields on average of mixed data types: continuous, nominal and ordinal variables of both numeric and character data. I rarely append rows, but I do perform many operations that create new columns.\nTypical operations involve combining several columns using conditional logic into a new, compound column. For example, if var1 > 2 then newvar = 'A' elif var2 = 4 then newvar = 'B'. The result of these operations is a new column for every record in my dataset.\nFinally, I would like to append these new columns into the on-disk data structure. 
I would repeat step 2, exploring the data with crosstabs and descriptive statistics trying to find interesting, intuitive relationships to model.\nA typical project file is usually about 1GB. Files are organized in such a manner that a row consists of a record of consumer data. Each row has the same number of columns for every record. This will always be the case.\nIt's pretty rare that I would subset by rows when creating a new column. However, it's pretty common for me to subset on rows when creating reports or generating descriptive statistics. For example, I might want to create a simple frequency for a specific line of business, say Retail credit cards. To do this, I would select only those records where the line of business = retail in addition to whichever columns I want to report on. When creating new columns, however, I would pull all rows of data and only the columns I need for the operations.\nThe modeling process requires that I analyze every column, look for interesting relationships with some outcome variable, and create new compound columns that describe those relationships. I usually explore the columns in small sets. For example, I will focus on a set of say 20 columns just dealing with property values and observe how they relate to defaulting on a loan. Once those are explored and new columns are created, I then move on to another group of columns, say college education, and repeat the process. What I'm doing is creating candidate variables that explain the relationship between my data and some outcome. At the very end of this process, I apply some learning techniques that create an equation out of those compound columns.\n\nIt is rare that I would ever add rows to the dataset. I will nearly always be creating new columns (variables or features in statistics/machine learning parlance).\n", "output": "I routinely use tens of gigabytes of data in just this fashion\ne.g. 
I have tables on disk that I read via queries, create data and append back.\nIt's worth reading the docs and later in this thread for several suggestions for how to store your data.\nDetails which will affect how you store your data, like:\nGive as much detail as you can; and I can help you develop a structure.\n\nSize of data, # of rows, columns, types of columns; are you appending\nrows, or just columns? \nWhat will typical operations look like? E.g. do a query on columns to select a bunch of rows and specific columns, then do an operation (in-memory), create new columns, save these.\n(Giving a toy example could enable us to offer more specific recommendations.)\nAfter that processing, then what do you do? Is step 2 ad hoc, or repeatable?\nInput flat files: how many, rough total size in Gb. How are these organized e.g. by records? Does each one contain different fields, or do they have some records per file with all of the fields in each file?\nDo you ever select subsets of rows (records) based on criteria (e.g. select the rows with field A > 5)? and then do something, or do you just select fields A, B, C with all of the records (and then do something)?\nDo you 'work on' all of your columns (in groups), or are there a good proportion that you may only use for reports (e.g. you want to keep the data around, but don't need to pull in that column explicitly until final results time)?\n\nSolution\nEnsure you have pandas at least 0.10.1 installed.\nRead iterating files chunk-by-chunk and multiple table queries.\nSince pytables is optimized to operate row-wise (which is what you query on), we will create a table for each group of fields. This way it's easy to select a small group of fields (which will work with a big table, but it's more efficient to do it this way... I think I may be able to fix this limitation in the future... 
this is more intuitive anyhow):\n(The following is pseudocode.)\nimport numpy as np\nimport pandas as pd\n\n# create a store\nstore = pd.HDFStore('mystore.h5')\n\n# this is the key to your storage:\n# this maps your fields to a specific group, and defines \n# what you want to have as data_columns.\n# you might want to create a nice class wrapping this\n# (as you will want to have this map and its inversion) \ngroup_map = dict(\n A = dict(fields = ['field_1','field_2',.....], dc = ['field_1',....,'field_5']),\n B = dict(fields = ['field_10',...... ], dc = ['field_10']),\n .....\n REPORTING_ONLY = dict(fields = ['field_1000','field_1001',...], dc = []),\n\n)\n\ngroup_map_inverted = dict()\nfor g, v in group_map.items():\n group_map_inverted.update(dict([ (f,g) for f in v['fields'] ]))\n\nReading in the files and creating the storage (essentially doing what append_to_multiple does):\nfor f in files:\n # read in the file, additional options may be necessary here\n # the chunksize is not strictly necessary, you may be able to slurp each \n # file into memory in which case just eliminate this part of the loop \n # (you can also change chunksize if necessary)\n for chunk in pd.read_table(f, chunksize=50000):\n # we are going to append to each table by group\n # we are not going to create indexes at this time\n # but we *ARE* going to create (some) data_columns\n\n # figure out the field groupings\n for g, v in group_map.items():\n # create the frame for this group\n frame = chunk.reindex(columns = v['fields'], copy = False) \n\n # append it\n store.append(g, frame, index=False, data_columns = v['dc'])\n\nNow you have all of the tables in the file (actually you could store them in separate files if you wish, you would probably have to add the filename to the group_map, but this probably isn't necessary).\nThis is how you get columns and create new ones:\nframe = store.select(group_that_I_want)\n# you can optionally specify:\n# columns = a list of the columns IN THAT GROUP (if 
you wanted to\n# select only say 3 out of the 20 columns in this sub-table)\n# and a where clause if you want a subset of the rows\n\n# do calculations on this frame\nnew_frame = cool_function_on_frame(frame)\n\n# to 'add columns', create a new group (you probably want to\n# limit the columns in this new_group to be only NEW ones\n# (e.g. so you don't overlap from the other tables)\n# add this info to the group_map\nstore.append(new_group, new_frame.reindex(columns = new_columns_created, copy = False), data_columns = new_columns_created)\n\nWhen you are ready for post_processing:\n# This may be a bit tricky; and depends what you are actually doing.\n# I may need to modify this function to be a bit more general:\nreport_data = store.select_as_multiple([groups_1,groups_2,.....], where =['field_1>0', 'field_1000=foo'], selector = group_1)\n\nAbout data_columns, you don't actually need to define ANY data_columns; they allow you to sub-select rows based on the column. E.g. something like:\nstore.select(group, where = ['field_1000=foo', 'field_1001>0'])\n\nThey may be most interesting to you in the final report generation stage (essentially a data column is segregated from other columns, which might impact efficiency somewhat if you define a lot).\nYou also might want to:\n\ncreate a function which takes a list of fields, looks up the groups in the group_map, then selects these and concatenates the results so you get the resulting frame (this is essentially what select_as_multiple does). This way the structure would be pretty transparent to you.\ncreate indexes on certain data columns (makes row-subsetting much faster).\nenable compression.\n\nLet me know if you have questions!\n"} {"input": "Convert hex string to int in Python How do I convert a hex string to an int in Python? 
\nI may have it as \"0xffff\" or just \"ffff\".\n", "output": "Without the 0x prefix, you need to specify the base explicitly, otherwise there's no way to tell:\nx = int(\"deadbeef\", 16)\n\nWith the 0x prefix, Python can distinguish hex and decimal automatically.\n>>> print int(\"0xdeadbeef\", 0)\n3735928559\n>>> print int(\"10\", 0)\n10\n\n(You must specify 0 as the base in order to invoke this prefix-guessing behavior; omitting the second parameter means to assume base-10. See the comments for more details.)\n"} {"input": "How do you remove duplicates from a list whilst preserving order? Is there a built-in that removes duplicates from a list in Python, whilst preserving order? I know that I can use a set to remove duplicates, but that destroys the original order. I also know that I can roll my own like this:\ndef uniq(input):\n output = []\n for x in input:\n if x not in output:\n output.append(x)\n return output\n\n(Thanks to unwind for that code sample.)\nBut I'd like to avail myself of a built-in or a more Pythonic idiom if possible.\nRelated question: In Python, what is the fastest algorithm for removing duplicates from a list so that all elements are unique while preserving order?\n", "output": "Here you have some alternatives: http://www.peterbe.com/plog/uniqifiers-benchmark\nFastest one:\ndef f7(seq):\n seen = set()\n seen_add = seen.add\n return [x for x in seq if not (x in seen or seen_add(x))]\n\nWhy assign seen.add to seen_add instead of just calling seen.add? Python is a dynamic language, and resolving seen.add each iteration is more costly than resolving a local variable. seen.add could have changed between iterations, and the runtime isn't smart enough to rule that out. 
To play it safe, it has to check the object each time.\nIf you plan on using this function a lot on the same dataset, perhaps you would be better off with an ordered set: http://code.activestate.com/recipes/528878/\nO(1) insertion, deletion and member-check per operation.\n"} {"input": "Directory listing in Python How do I get a list of all files (and directories) in a given directory in Python?\n", "output": "This is a way to traverse every file and directory in a directory tree:\nimport os\n\nfor dirname, dirnames, filenames in os.walk('.'):\n # print path to all subdirectories first.\n for subdirname in dirnames:\n print(os.path.join(dirname, subdirname))\n\n # print path to all filenames.\n for filename in filenames:\n print(os.path.join(dirname, filename))\n\n # Advanced usage:\n # editing the 'dirnames' list will stop os.walk() from recursing into there.\n if '.git' in dirnames:\n # don't go into any .git directories.\n dirnames.remove('.git')\n\n"} {"input": "Checking whether a variable is an integer or not How do I check whether a variable is an integer?\n", "output": "If you need to do this, do\nisinstance(x, int)\n\nunless you are in Python 2.x in which case you want\nisinstance(x, (int, long))\n\nDo not use type. It is almost never the right answer in Python, since it blocks all the flexibility of polymorphism. For instance, if you subclass int, your new class should register as an int, which type will not do:\nclass Spam( int ): pass\nx = Spam( 0 )\ntype( x ) == int # False\nisinstance( x, int ) # True\n\nThis adheres to Python's strong polymorphism: you should allow any object that behaves like an int, instead of mandating that it be one.\nBUT\nThe classical Python mentality, though, is that it's easier to ask forgiveness than permission. 
In other words, don't check whether x is an integer; assume that it is and catch the resulting exception if it isn't:\ntry:\n x += 1\nexcept TypeError:\n ...\n\nThis mentality is slowly being overtaken by the use of abstract base classes, which let you register exactly what properties your object should have (adding? multiplying? doubling?) by making it inherit from a specially-constructed class. That would be the best solution, since it will permit exactly those objects with the necessary and sufficient attributes, but you will have to read the docs on how to use it.\n"} {"input": "How do I unload (reload) a Python module? I have a long-running Python server and would like to be able to upgrade a service without restarting the server. What's the best way do do this?\nif foo.py has changed:\n unimport foo <-- How do I do this?\n import foo\n myfoo = foo.Foo()\n\n", "output": "You can reload a module when it has already been imported by using the reload builtin function in Python 2:\nimport foo\n\nwhile True:\n # Do some things.\n if is_changed(foo):\n foo = reload(foo)\n\nIn Python 3, reload was moved to the imp module. In 3.4, imp was deprecated in favor of importlib, and reload was added to the latter. When targeting 3 or later, either reference the appropriate module when calling reload or import it.\n I think that this is what you want. Web servers like Django's development server use this so that you can see the effects of your code changes without restarting the server process itself.\nTo quote from the docs:\n\nPython modules' code is recompiled and\n the module-level code reexecuted,\n defining a new set of objects which\n are bound to names in the module's\n dictionary. The init function of\n extension modules is not called a\n second time. As with all other objects\n in Python the old objects are only\n reclaimed after their reference counts\n drop to zero. 
The names in the module\n namespace are updated to point to any\n new or changed objects. Other\n references to the old objects (such as\n names external to the module) are not\n rebound to refer to the new objects\n and must be updated in each namespace\n where they occur if that is desired.\n\nAs you noted in your question, you'll have to reconstruct Foo objects if the Foo class resides in the foo module.\n"} {"input": "Can someone explain __all__ in Python? I have been using Python more and more, and I keep seeing the variable __all__ set in different __init__.py files. Can someone explain what this does?\n", "output": "Linked to, but not explicitly mentioned here, is exactly when __all__ is used. It is a list of strings defining what symbols in a module will be exported when from <module> import * is used on the module.\nFor example, the following code in a foo.py explicitly exports the symbols bar and baz:\n__all__ = ['bar', 'baz']\n\nwaz = 5\nbar = 10\ndef baz(): return 'baz'\n\nThese symbols can then be imported like so:\nfrom foo import *\n\nprint bar\nprint baz\n\n# The following will trigger an exception, as \"waz\" is not exported by the module\nprint waz\n\nIf the __all__ above is commented out, this code will then execute to completion, as the default behaviour of import * is to import all symbols that do not begin with an underscore, from the given namespace.\nReference: https://docs.python.org/3.5/tutorial/modules.html#importing-from-a-package\nNOTE: __all__ affects the from <module> import * behavior only. Members that are not mentioned in __all__ are still accessible from outside the module and can be imported with from <module> import <member>.\n"} {"input": "How do I protect Python code? I am developing a piece of software in Python that will be distributed to my employer's customers. 
My employer wants to limit the usage of the software with a time-restricted license file.\nIf we distribute the .py files or even .pyc files it will be easy to (decompile and) remove the code that checks the license file.\nAnother aspect is that my employer does not want the code to be read by our customers, fearing that the code may be stolen or at least the \"novel ideas\".\nIs there a good way to handle this problem? Preferably with an off-the-shelf solution.\nThe software will run on Linux systems (so I don't think py2exe will do the trick).\n", "output": "\"Is there a good way to handle this problem?\" No. Nothing can be protected against reverse engineering. Even the firmware on DVD machines has been reverse engineered and the AACS encryption key exposed. And that's in spite of the DMCA making that a criminal offense.\nSince no technical method can stop your customers from reading your code, you have to apply ordinary commercial methods.\n\nLicenses. Contracts. Terms and Conditions. This still works even when people can read the code. Note that some of your Python-based components may require that you pay fees before you sell software using those components. Also, some open-source licenses prohibit you from concealing the source or origins of that component. \nOffer significant value. If your stuff is so good -- at a price that is hard to refuse -- there's no incentive to waste time and money reverse engineering anything. Reverse engineering is expensive. Make your product slightly less expensive.\nOffer upgrades and enhancements that make any reverse engineering a bad idea. When the next release breaks their reverse engineering, there's no point. This can be carried to absurd extremes, but you should offer new features that make the next release more valuable than reverse engineering.\nOffer customization at rates so attractive that they'd rather pay you to build and support the enhancements.\nUse a license key which expires. 
This is cruel, and will give you a bad reputation, but it certainly makes your software stop working. \nOffer it as a web service. SaaS involves no downloads to customers. \n\n"} {"input": "What is the best project structure for a Python application? Imagine that you want to develop a non-trivial end-user desktop (not web) application in Python. What is the best way to structure the project's folder hierarchy?\nDesirable features are ease of maintenance, IDE-friendliness, suitability for source control branching/merging, and easy generation of install packages.\nIn particular:\n\nWhere do you put the source?\nWhere do you put application startup scripts?\nWhere do you put the IDE project cruft?\nWhere do you put the unit/acceptance tests?\nWhere do you put non-Python data such as config files?\nWhere do you put non-Python sources such as C++ for pyd/so binary extension modules?\n\n", "output": "Doesn't too much matter. Whatever makes you happy will work. There aren't a lot of silly rules because Python projects can be simple.\n\n/scripts or /bin for that kind of command-line interface stuff\n/tests for your tests\n/lib for your C-language libraries\n/doc for most documentation\n/apidoc for the Epydoc-generated API docs.\n\nAnd the top-level directory can contain README's, Config's and whatnot.\nThe hard choice is whether or not to use a /src tree. Python doesn't have a distinction between /src, /lib, and /bin like Java or C has.\nSince a top-level /src directory is seen by some as meaningless, your top-level directory can be the top-level architecture of your application.\n\n/foo\n/bar\n/baz\n\nI recommend putting all of this under the \"name-of-my-product\" directory. So, if you're writing an application named quux, the directory that contains all this stuff is named /quux.\nAnother project's PYTHONPATH, then, can include /path/to/quux/foo to reuse the QUUX.foo module. \nIn my case, since I use Komodo Edit, my IDE cruft is a single .KPF file. 
I actually put that in the top-level /quux directory, and omit adding it to SVN.\n"} {"input": "How do I check if a variable exists? I want to check if a variable exists. Now I'm doing something like this:\ntry:\n myVar\nexcept NameError:\n # Do something.\n\nAre there other ways without exceptions?\n", "output": "To check the existence of a local variable:\nif 'myVar' in locals():\n # myVar exists.\n\nTo check the existence of a global variable:\nif 'myVar' in globals():\n # myVar exists.\n\nTo check if an object has an attribute:\nif hasattr(obj, 'attr_name'):\n # obj.attr_name exists.\n\n"} {"input": "Getting the length of an array in Python In Python, is the following the only way to get the number of elements?\narr.__len__()\n\nIf so, why the strange syntax?\n", "output": "my_list = [1,2,3,4,5]\nlen(my_list)\n\nThe same works for tuples:\nmy_tuple = (1,2,3,4,5)\nlen(my_tuple)\n\nAnd strings, which are really just arrays of characters:\nmy_string = 'hello world'\nlen(my_string)\n\nIt was intentionally done this way so that lists, tuples and other container types didn't all need to explicitly implement a public .length() method, instead you can just check the len() of anything that implements the 'magic' __len__() method.\nSure, this may seem redundant, but length checking implementations can vary considerably, even within the same language. It's not uncommon to see one collection type use a .length() method while another type uses a .length property, while yet another uses .count(). Having a language-level keyword unifies the entry point for all these types. So even objects you may not consider to be lists of elements could still be length-checked. This includes strings, queues, trees, etc.\n"} {"input": "Is there any way to kill a Thread in Python? Is it possible to terminate a running thread without setting/checking any flags/semaphores/etc.?\n", "output": "It is generally a bad pattern to kill a thread abruptly, in Python and in any language. 
Think of the following cases:\n\nthe thread is holding a critical resource that must be closed properly\nthe thread has created several other threads that must be killed as well.\n\nThe nice way of handling this, if you can afford it (if you are managing your own threads), is to have an exit_request flag that each thread checks at regular intervals to see if it is time for it to exit.\nFor example:\nimport threading\n\nclass StoppableThread(threading.Thread):\n \"\"\"Thread class with a stop() method. The thread itself has to check\n regularly for the stopped() condition.\"\"\"\n\n def __init__(self):\n super(StoppableThread, self).__init__()\n self._stop = threading.Event()\n\n def stop(self):\n self._stop.set()\n\n def stopped(self):\n return self._stop.isSet()\n\nIn this code, you should call stop() on the thread when you want it to exit, and wait for the thread to exit properly using join(). The thread should check the stop flag at regular intervals.\nThere are cases, however, when you really need to kill a thread. 
An example is when you are wrapping an external library that is busy with long calls and you want to interrupt it.\nThe following code allows you (with some restrictions) to raise an Exception in a Python thread:\nimport ctypes\nimport inspect\nimport threading\n\ndef _async_raise(tid, exctype):\n '''Raises an exception in the thread with id tid'''\n if not inspect.isclass(exctype):\n raise TypeError(\"Only types can be raised (not instances)\")\n res = ctypes.pythonapi.PyThreadState_SetAsyncExc(tid,\n ctypes.py_object(exctype))\n if res == 0:\n raise ValueError(\"invalid thread id\")\n elif res != 1:\n # \"if it returns a number greater than one, you're in trouble,\n # and you should call it again with exc=NULL to revert the effect\"\n ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, 0)\n raise SystemError(\"PyThreadState_SetAsyncExc failed\")\n\nclass ThreadWithExc(threading.Thread):\n '''A thread class that supports raising an exception in the thread from\n another thread.\n '''\n def _get_my_tid(self):\n \"\"\"determines this (self's) thread id\n\n CAREFUL : this function is executed in the context of the caller\n thread, to get the identity of the thread represented by this\n instance.\n \"\"\"\n if not self.isAlive():\n raise threading.ThreadError(\"the thread is not active\")\n\n # do we have it cached?\n if hasattr(self, \"_thread_id\"):\n return self._thread_id\n\n # no, look for it in the _active dict\n for tid, tobj in threading._active.items():\n if tobj is self:\n self._thread_id = tid\n return tid\n\n # TODO: in python 2.6, there's a simpler way to do : self.ident\n\n raise AssertionError(\"could not determine the thread's id\")\n\n def raiseExc(self, exctype):\n \"\"\"Raises the given exception type in the context of this thread.\n\n If the thread is busy in a system call (time.sleep(),\n socket.accept(), ...), the exception is simply ignored.\n\n If you are sure that your exception should terminate the thread,\n one way to ensure that it works is:\n\n t = ThreadWithExc( ... 
)\n ...\n t.raiseExc( SomeException )\n while t.isAlive():\n time.sleep( 0.1 )\n t.raiseExc( SomeException )\n\n If the exception is to be caught by the thread, you need a way to\n check that your thread has caught it.\n\n CAREFUL : this function is executed in the context of the\n caller thread, to raise an excpetion in the context of the\n thread represented by this instance.\n \"\"\"\n _async_raise( self._get_my_tid(), exctype )\n\nAs noted in the documentation, this is not a magic bullet because if the thread is busy outside the Python interpreter, it will not catch the interruption.\nA good usage pattern of this code is to have the thread catch a specific exception and perform the cleanup. That way, you can interrupt a task and still have proper cleanup.\n"} {"input": "Behaviour of increment and decrement operators in Python I notice that a pre-increment/decrement operator can be applied on a variable (like ++count). It compiles, but it does not actually change the value of the variable!\nWhat is the behavior of the pre-increment/decrement operators (++/--) in Python? \nWhy does Python deviate from the behavior of these operators seen in C/C++?\n", "output": "++ is not an operator. It is two + operators. The + operator is the identity operator, which does nothing. (Clarification: the + and - unary operators only work on numbers, but I presume that you wouldn't expect a hypothetical ++ operator to work on strings.)\n++count\n\nParses as\n+(+count)\n\nWhich translates to\ncount\n\nYou have to use the slightly longer += operator to do what you want to do:\ncount += 1\n\nI suspect the ++ and -- operators were left out for consistency and simplicity. I don't know the exact argument Guido van Rossum gave for the decision, but I can imagine a few arguments:\n\nSimpler parsing. Technically, parsing ++count is ambiguous, as it could be +, +, count (two unary + operators) just as easily as it could be ++, count (one unary ++ operator). 
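The +(+count) parse described earlier is easy to verify in an interpreter — a quick sketch (Python 3 syntax; the behavior is the same in Python 2):

```python
count = 5
print(++count)   # parsed as +(+count), the identity applied twice: still 5
print(--count)   # parsed as -(-count), negation applied twice: still 5
count += 1       # the actual increment idiom in Python
print(count)     # 6
```

So ++count is accepted by the parser but silently does nothing to the variable, which is exactly the surprise the question describes.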
It's not a significant syntactic ambiguity, but it does exist.\nSimpler language. ++ is nothing more than a synonym for += 1. It was a shorthand invented because C compilers were stupid and didn't know how to optimize a += 1 into the inc instruction most computers have. In this day of optimizing compilers and bytecode interpreted languages, adding operators to a language to allow programmers to optimize their code is usually frowned upon, especially in a language like Python that is designed to be consistent and readable.\nConfusing side-effects. One common newbie error in languages with ++ operators is mixing up the differences (both in precedence and in return value) between the pre- and post-incremend/decrement operators, and Python likes to eliminate language \"gotcha\"-s. The precedence issues of pre-/post-increment in C are pretty hairy, and incredibly easy to mess up.\n\n"} {"input": "Understanding kwargs in Python What are the uses for **kwargs in Python?\nI know you can do an objects.filter on a table and pass in a **kwargs argument. \u00a0\nCan I also do this for specifying time deltas i.e. timedelta(hours = time1)? \nHow exactly does it work? Is it classes as 'unpacking'? Like a,b=1,2?\n", "output": "You can use **kwargs to let your functions take an arbitrary number of keyword arguments (\"kwargs\" means \"keyword arguments\"):\n>>> def print_keyword_args(**kwargs):\n... # kwargs is a dict of the keyword args passed to the function\n... for key, value in kwargs.iteritems():\n... print \"%s = %s\" % (key, value)\n... 
\n>>> print_keyword_args(first_name=\"John\", last_name=\"Doe\")\nfirst_name = John\nlast_name = Doe\n\nYou can also use the **kwargs syntax when calling functions by constructing a dictionary of keyword arguments and passing it to your function:\n>>> kwargs = {'first_name': 'Bobby', 'last_name': 'Smith'}\n>>> print_keyword_args(**kwargs)\nfirst_name = Bobby\nlast_name = Smith\n\nThe Python Tutorial contains a good explanation of how it works, along with some nice examples.\n"} {"input": "How do I find the location of my Python site-packages directory? How do I find the location of my site-packages directory?\n", "output": ">>> import site; site.getsitepackages()\n['/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages']\n\n(or just first item with site.getsitepackages()[0])\n"} {"input": "How can I reverse a list in python? How can I do this in python?\narray = [0,10,20,40]\nfor (i = array.length() - 1 ;i >= 0; i--)\n\nI need to have the elements of an array but from the end to the beginning.\n", "output": ">>> L = [0,10,20,40]\n>>> L[::-1]\n[40, 20, 10, 0]\n\nExtended slice syntax is explained well here: http://docs.python.org/release/2.3.5/whatsnew/section-slices.html\nBy special request in a comment this is the most current slice documentation.\n"} {"input": "Try/Except in Python: How do you properly ignore Exceptions? When you just want to do a try-except without handling the exception, how do you do it in Python?\nIs the following the right way to do it?\ntry :\n shutil.rmtree ( path )\nexcept :\n pass\n\n", "output": "try:\n doSomething()\nexcept: \n pass\n\nor\ntry:\n doSomething()\nexcept Exception: \n pass\n\nThe difference is, that the first one will also catch KeyboardInterrupt, SystemExit and stuff like that, which are derived directly from exceptions.BaseException, not exceptions.Exception. 
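A small sketch of that difference (written in Python 3, where the exception classes live in builtins rather than the exceptions module):

```python
def bare_except(raiser):
    # a bare `except:` catches everything, including SystemExit
    try:
        raiser()
    except:
        return "caught"

def except_exception(raiser):
    # `except Exception:` lets BaseException subclasses like SystemExit escape
    try:
        raiser()
    except Exception:
        return "caught"

def boom():
    raise SystemExit

print(bare_except(boom))         # caught
try:
    except_exception(boom)
except SystemExit:
    print("SystemExit escaped")  # this branch runs
```

This is why a bare `except:` around long-running code can make a program ignore Ctrl-C (KeyboardInterrupt) or refuse to exit cleanly.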
See documentation for details:\n\ntry statement \u2014 http://docs.python.org/reference/compound_stmts.html#try\nexceptions \u2014 http://docs.python.org/library/exceptions\n\n"} {"input": "How to pip install packages according to requirements.txt from a local directory? Here is the problem\nI have a requirements.txt that looks like:\nBeautifulSoup==3.2.0\nDjango==1.3\nFabric==1.2.0\nJinja2==2.5.5\nPyYAML==3.09\nPygments==1.4\nSQLAlchemy==0.7.1\nSouth==0.7.3\namqplib==0.6.1\nanyjson==0.3\n...\n\nI have a local archive directory containing all the packages + others.\nI have created a new virtualenv with\nbin/virtualenv testing\n\nupon activating it, I tried to install the packages according to requirements.txt from the local archive directory.\nsource bin/activate\npip install -r /path/to/requirements.txt -f file:///path/to/archive/\n\nI got some output that seems to indicate that the installation is fine\nDownloading/unpacking Fabric==1.2.0 (from -r ../testing/requirements.txt (line 3))\n Running setup.py egg_info for package Fabric\n warning: no previously-included files matching '*' found under directory 'docs/_build'\n warning: no files found matching 'fabfile.py'\nDownloading/unpacking South==0.7.3 (from -r ../testing/requirements.txt (line 8))\n Running setup.py egg_info for package South\n....\n\nBut later check revealed none of the package is installed properly. I cannot import the package, and none is found in the site-packages directory of my virtualenv. So what went wrong?\n", "output": "I've read the above, realize this is an old question, but it's totally unresolved and still at the top of my google search results so here's an answer that works for everyone:\npip install -r /path/to/requirements.txt\n\n"} {"input": "What is a clean, pythonic way to have multiple constructors in Python? I can't find a definitive answer for this. AFAIK, you can't have multiple __init__ functions in a Python class. So what is a good way to solve this problem? 
\nSuppose I have an class called Cheese with the number_of_holes property. How can I have two ways of creating cheese-objects...\n\none that takes a number of holes like this: parmesan = Cheese(num_holes = 15)\nand one that takes no arguments and just randomizes the number_of_holes property: gouda = Cheese()\n\nI can think of only one way to do this, but that seems kinda clunky:\nclass Cheese():\n def __init__(self, num_holes = 0):\n if (num_holes == 0):\n # randomize number_of_holes\n else:\n number_of_holes = num_holes\n\nWhat do you say? Is there a better way?\n", "output": "Actually None is much better for \"magic\" values:\nclass Cheese():\n def __init__(self, num_holes = None):\n if num_holes is None:\n ...\n\nNow if you want complete freedom of adding more parameters:\nclass Cheese():\n def __init__(self, *args, **kwargs):\n #args -- tuple of anonymous arguments\n #kwargs -- dictionary of named arguments\n self.num_holes = kwargs.get('num_holes',random_holes())\n\nTo better explain the concept of *args and **kwargs (you can actually change these names):\ndef f(*args, **kwargs):\n print 'args: ', args, ' kwargs: ', kwargs\n\n>>> f('a')\nargs: ('a',) kwargs: {}\n>>> f(ar='a')\nargs: () kwargs: {'ar': 'a'}\n>>> f(1,2,param=3)\nargs: (1, 2) kwargs: {'param': 3}\n\nhttp://docs.python.org/reference/expressions.html#calls\n"} {"input": "Save plot to image file instead of displaying it using Matplotlib (so it can be used in batch scripts for example) I am writing a quick-and-dirty script to generate plots on the fly. 
I am using the code below (from Matplotlib documentation) as a starting point:\nfrom pylab import figure, axes, pie, title, show\n\n# Make a square figure and axes\nfigure(1, figsize=(6, 6))\nax = axes([0.1, 0.1, 0.8, 0.8])\n\nlabels = 'Frogs', 'Hogs', 'Dogs', 'Logs'\nfracs = [15, 30, 45, 10]\n\nexplode = (0, 0.05, 0, 0)\npie(fracs, explode=explode, labels=labels, autopct='%1.1f%%', shadow=True)\ntitle('Raining Hogs and Dogs', bbox={'facecolor': '0.8', 'pad': 5})\n\nshow() # Actually, don't show, just save to foo.png\n\nI don't want to display the plot on a GUI, instead, I want to save the plot to a file (say foo.png) - how do I do that?\n", "output": "While the question has been answered, I'd like to add some useful tips when using savefig. The file format can be specified by the extension:\nsavefig('foo.png')\nsavefig('foo.pdf')\n\nWill give a rasterized or vectorized output respectively, both which could be useful. In addition, you'll find that pylab leaves a generous, often undesirable, whitespace around the image. Remove it with:\nsavefig('foo.png', bbox_inches='tight')\n\n"} {"input": "What is the difference between range and xrange functions in Python 2.X? Apparently xrange is faster but I have no idea why it's faster (and no proof besides the anecdotal so far that it is faster) or what besides that is different about\nfor i in range(0, 20):\nfor i in xrange(0, 20):\n\n", "output": "range creates a list, so if you do range(1, 10000000) it creates a list in memory with 9999999 elements.\nxrange is a sequence object that evaluates lazily. \n"} {"input": "How to debug in Django, the good way? So, I started learning to code in Python and later Django. The first times it was hard looking at tracebacks and actually figure out what I did wrong and where the syntax error was. Some time has passed now and some way along the way, I guess I got a routine in debugging my Django code. 
As this was done early in my coding experience, I sat down and wondered if how I was doing this was ineffective and could be done faster. I usually manage to find and correct the bugs in my code, but I wonder if I should be doing it faster?\nI usually just use the debug info Django gives when enabled. When things do end up as I thought it would, I break the code flow a lot with a syntax error, and look at the variables at that point in the flow to figure out, where the code does something other than what I wanted.\nBut can this be improved? Are there some good tools or better ways to debug your Django code?\n", "output": "There are a bunch of ways to do it, but the most straightforward is to simply \nuse the Python debugger. Just add following line in to a Django view function:\nimport pdb; pdb.set_trace()\n\nIf you try to load that page in your browser, the browser will hang and you get a prompt to carry on debugging on actual executing code.\nHowever there are other options suggested by others (I won't recommend them):\n* return HttpResponse({variable to inspect})\n\n* print {variable to inspect}\n\n* raise Exception({variable to inspect})\n\nBut the Python Debugger (pdb) is highly recommended for all types of Python code. If you are already into pdb, you'd also want to have a look at IPDB that uses ipython for debugging. Happy Coding.\nA useful reference provided by Seafangs : Using the Python debugger in Django\n"} {"input": "Python List Comprehension Vs. Map Is there a reason to prefer using map() over list comprehension or vice versa? Is one generally more effecient or generally considered more pythonic than the other?\n", "output": "map may be microscopically faster in some cases (when you're NOT making a lambda for the purpose, but using the same function in map and a listcomp). 
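Either way, the two forms produce the same result — a quick sketch (note that on Python 3, map returns a lazy iterator, so a list() call is needed to compare):

```python
xs = range(10)
mapped = list(map(hex, xs))          # list() needed on Python 3, where map is lazy
comprehended = [hex(x) for x in xs]
print(mapped == comprehended)        # True
```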
List comprehensions may be faster in other cases and most (not all) pythonistas consider them more direct and clearer.\nAn example of the tiny speed advantage of map when using exactly the same function:\n$ python -mtimeit -s'xs=range(10)' 'map(hex, xs)'\n100000 loops, best of 3: 4.86 usec per loop\n$ python -mtimeit -s'xs=range(10)' '[hex(x) for x in xs]'\n100000 loops, best of 3: 5.58 usec per loop\n\nAn example of how performance comparison gets completely reversed when map needs a lambda:\n$ python -mtimeit -s'xs=range(10)' 'map(lambda x: x+2, xs)'\n100000 loops, best of 3: 4.24 usec per loop\n$ python -mtimeit -s'xs=range(10)' '[x+2 for x in xs]'\n100000 loops, best of 3: 2.32 usec per loop\n\n"} {"input": "Should you always favor xrange() over range()? Why or why not?\n", "output": "For performance, especially when you're iterating over a large range, xrange() is usually better. However, there are still a few cases why you might prefer range():\n\nIn python 3, range() does what xrange() used to do and xrange() does not exist. If you want to write code that will run on both Python 2 and Python 3, you can't use xrange().\nrange() can actually be faster in some cases - eg. if iterating over the same sequence multiple times. xrange() has to reconstruct the integer object every time, but range() will have real integer objects. (It will always perform worse in terms of memory however)\nxrange() isn't usable in all cases where a real list is needed. For instance, it doesn't support slices, or any list methods.\n\n[Edit] There are a couple of posts mentioning how range() will be upgraded by the 2to3 tool. 
For the record, here's the output of running the tool on some sample usages of range() and xrange()\nRefactoringTool: Skipping implicit fixer: buffer\nRefactoringTool: Skipping implicit fixer: idioms\nRefactoringTool: Skipping implicit fixer: ws_comma\n--- range_test.py (original)\n+++ range_test.py (refactored)\n@@ -1,7 +1,7 @@\n\n for x in range(20):\n- a=range(20)\n+ a=list(range(20))\n b=list(range(20))\n c=[x for x in range(20)]\n d=(x for x in range(20))\n- e=xrange(20)\n+ e=range(20)\n\nAs you can see, when used in a for loop or comprehension, or where already wrapped with list(), range is left unchanged.\n"} {"input": "How does Python's super() work with multiple inheritance? I'm pretty much new in Python object oriented programming and I have trouble\nunderstanding the super() function (new style classes) especially when it comes to multiple inheritance.\nFor example if you have something like:\nclass First(object):\n def __init__(self):\n print \"first\"\n\nclass Second(object):\n def __init__(self):\n print \"second\"\n\nclass Third(First, Second):\n def __init__(self):\n super(Third, self).__init__()\n print \"that's it\"\n\nWhat I don't get is: will the Third() class inherit both constructor methods? If yes, then which one will be run with super() and why?\nAnd what if you want to run the other one? I know it has something to do with Python method resolution order (MRO). \n", "output": "This is detailed with a reasonable amount of detail by Guido himself at http://python-history.blogspot.com/2010/06/method-resolution-order.html (including two earlier attempts).\nBut, briefly: in your example, Third() will call First.__init__. For such simple situations, Python will look for the attribute (in this case, __init__) on the class's parents, left to right. 
So, if you define\nclass Third(First, Second):\n ...\n\nPython will look at First, and, if First doesn't have the attribute, at Second.\nThis situation becomes more complex when inheritance starts crossing paths (say, if First inherited from Second, for instance). Read the link above for more details, but, in a nutshell, Python will try to maintain the order in which each class appears on the inheritance list, child classes first.\nSo, for instance, if you had:\nclass First(object):\n def __init__(self):\n print \"first\"\n\nclass Second(First):\n def __init__(self):\n print \"second\"\n\nclass Third(First):\n def __init__(self):\n print \"third\"\n\nclass Fourth(Second, Third):\n def __init__(self):\n super(Fourth, self).__init__()\n print \"that's it\"\n\nthe MRO would be [Fourth, Second, Third, First].\nBy the way: if Python cannot find a coherent method resolution order, it'll raise an exception, instead of falling back to a behaviour which might surprise the user.\nEdited to add example of an ambiguous MRO:\nclass First(object):\n def __init__(self):\n print \"first\"\n\nclass Second(First):\n def __init__(self):\n print \"second\"\n\nclass Third(First, Second):\n def __init__(self):\n print \"third\"\n\nShould Third's MRO be [First, Second] or [Second, First]? There's no obvious expectation, and Python will raise an error:\n\nTypeError: Error when calling the\n metaclass bases\n Cannot create a consistent method resolution order (MRO) for bases\n Second, First\n\n[Edit] I see several people arguing that the examples above lack super() calls, so let me explain: the point of the examples is to show how the MRO is constructed. They are not intended print \"first\\nsecond\\third\" or whatever. You can - and should, of course, play around with the example, add super() calls, see what happens, and gain a deeper understanding of Python's inheritance model. But my goal here is to keep it simple and show how the MRO is build. 
And it is built as I explained:\n>>> Fourth.__mro__\n(, , , , )\n\n"} {"input": "Python class inherits object Is there any reason for a class declaration to inherit from object?\nI just found some code that does this and I can't find a good reason why.\nclass MyClass(object):\n # class code follows...\n\n", "output": "Python 3.x:\nclass MyClass(object): = new-style class\nclass MyClass: = new-style class (implicitly inherits from object)\nPython 2.x:\nclass MyClass(object): = new-style class\nclass MyClass: = OLD-STYLE CLASS\nExplanation:\nWhen defining base classes in Python 3.x, you\u00e2\u0080\u0099re allowed to drop the object from the definition. However, this can open the door for a seriously hard to track problem\u00e2\u0080\u00a6\nPython introduced new-style classes back in Python 2.2, and by now old-style classes are really quite old. Discussion of old-style classes is buried in the 2.x docs, and non-existent in the 3.x docs.\nThe problem is, the syntax for old-style classes in Python 2.x is the same as the alternative syntax for new-style classes in Python 3.x. Python 2.x is still very widely used (e.g. GAE, Web2Py), and any code (or coder) unwittingly bringing 3.x-style class definitions into 2.x code is going to end up with some seriously outdated base objects. And because old-style classes aren\u00e2\u0080\u0099t on anyone\u00e2\u0080\u0099s radar, they likely won\u00e2\u0080\u0099t know what hit them.\nSo just spell it out the long way and save some 2.x developer the tears.\n"} {"input": "How can I sort a dictionary by key? What would be a nice way to go from {2:3, 1:89, 4:5, 3:0} to {1:89, 2:3, 3:0, 4:5}?\nI checked some posts but they all use the \"sorted\" operator that returns tuples. \n", "output": "Standard Python dictionaries are unordered. 
Even if you sorted the (key,value) pairs, you wouldn't be able to store them in a dict in a way that would preserve the ordering.\nThe easiest way is to use OrderedDict, which remembers the order in which the elements have been inserted:\nIn [1]: import collections\n\nIn [2]: d = {2:3, 1:89, 4:5, 3:0}\n\nIn [3]: od = collections.OrderedDict(sorted(d.items()))\n\nIn [4]: od\nOut[4]: OrderedDict([(1, 89), (2, 3), (3, 0), (4, 5)])\n\nNever mind the way od is printed out; it'll work as expected:\nIn [11]: od[1]\nOut[11]: 89\n\nIn [12]: od[3]\nOut[12]: 0\n\nIn [13]: for k, v in od.iteritems(): print k, v\n ....: \n1 89\n2 3\n3 0\n4 5\n\nPython 3\nFor Python 3 users, one needs to use the .items() instead of .iteritems():\nIn [13]: for k, v in od.items(): print(k, v)\n ....: \n1 89\n2 3\n3 0\n4 5\n\n"} {"input": "Adding a Method to an Existing Object Instance I've read that it is possible to add a method to an existing object (e.g. not in the class definition) in Python, I think this is called Monkey Patching (or in some cases Duck Punching). I understand that it's not always a good decision to do so. But, how might one do this?\nUPDATE 8/04/2008 00:21:01 EST:\nThat looks like a good answer John Downey, I tried it but it appears that it ends up being not a true method.\nYour example defines the new patch function with an argument of self, but if you write actual code that way, the now patched class method asks for an argument named self (it doesn't automagically recognize it as the object to which it is supposed to bind, which is what would happen if defined within the class definition), meaning you have to call class.patch(obj) instead of just class.patch() if you want the same functionality as a true method.\nIt looks like Python isn't really treating it as a method, but more just as a variable which happens to be a function (and as such is callable). 
Is there any way to attach an actual method to a class?\nOh, and Ryan, that isn't exactly what I was looking for (it isn't a builtin functionality), but it is quite cool nonetheless.\n", "output": "In Python, there is a difference between functions and bound methods.\n>>> def foo():\n... print \"foo\"\n...\n>>> class A:\n... def bar( self ):\n... print \"bar\"\n...\n>>> a = A()\n>>> foo\n\n>>> a.bar\n>\n>>>\n\nBound methods have been \"bound\" (how descriptive) to an instance, and that instance will be passed as the first argument whenever the method is called.\nCallables that are attributes of a class (as opposed to an instance) are still unbound, though, so you can modify the class definition whenever you want:\n>>> def fooFighters( self ):\n... print \"fooFighters\"\n...\n>>> A.fooFighters = fooFighters\n>>> a2 = A()\n>>> a2.fooFighters\n>\n>>> a2.fooFighters()\nfooFighters\n\nPreviously defined instances are updated as well (as long as they haven't overridden the attribute themselves):\n>>> a.fooFighters()\nfooFighters\n\nThe problem comes when you want to attach a method to a single instance:\n>>> def barFighters( self ):\n... 
print \"barFighters\"\n...\n>>> a.barFighters = barFighters\n>>> a.barFighters()\nTraceback (most recent call last):\n File \"\", line 1, in \nTypeError: barFighters() takes exactly 1 argument (0 given)\n\nThe function is not automatically bound when it's attached directly to an instance:\n>>> a.barFighters\n\n\nTo bind it, we can use the MethodType function in the types module:\n>>> import types\n>>> a.barFighters = types.MethodType( barFighters, a )\n>>> a.barFighters\n>\n>>> a.barFighters()\nbarFighters\n\nThis time other instances of the class have not been affected:\n>>> a2.barFighters()\nTraceback (most recent call last):\n File \"\", line 1, in \nAttributeError: A instance has no attribute 'barFighters'\n\nMore information can be found by reading about descriptors and metaclass programming.\n"} {"input": "How do I read a text file into a string variable in Python I use the following code segment to read a file in python\nwith open (\"data.txt\", \"r\") as myfile:\n data=myfile.readlines()\n\ninput file is\nLLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN\nGGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE\n\nand when I print data I get\n['LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN\\n', 'GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE']\n\nAs I see data is in list form. How do I make it string. And also how do I remove \"\\n\", \"[\", and \"]\" characters from it ?\n", "output": "You could use:\nwith open('data.txt', 'r') as myfile:\n data=myfile.read().replace('\\n', '')\n\n"} {"input": "How to install psycopg2 with \"pip\" on Python? 
I'm using virtualenv and I need to install \"psycopg2\".\nI have done the following:\npip install http://pypi.python.org/packages/source/p/psycopg2/psycopg2-2.4.tar.gz#md5=24f4368e2cfdc1a2b03282ddda814160\n\nAnd I have the following messages:\nDownloading/unpacking http://pypi.python.org/packages/source/p/psycopg2/psycopg2\n-2.4.tar.gz#md5=24f4368e2cfdc1a2b03282ddda814160\n Downloading psycopg2-2.4.tar.gz (607Kb): 607Kb downloaded\n Running setup.py egg_info for package from http://pypi.python.org/packages/sou\nrce/p/psycopg2/psycopg2-2.4.tar.gz#md5=24f4368e2cfdc1a2b03282ddda814160\n Error: pg_config executable not found.\n\n Please add the directory containing pg_config to the PATH\n or specify the full executable path with the option:\n\n python setup.py build_ext --pg-config /path/to/pg_config build ...\n\n or with the pg_config option in 'setup.cfg'.\n Complete output from command python setup.py egg_info:\n running egg_info\n\ncreating pip-egg-info\\psycopg2.egg-info\n\nwriting pip-egg-info\\psycopg2.egg-info\\PKG-INFO\n\nwriting top-level names to pip-egg-info\\psycopg2.egg-info\\top_level.txt\n\nwriting dependency_links to pip-egg-info\\psycopg2.egg-info\\dependency_links.txt\n\nwriting manifest file 'pip-egg-info\\psycopg2.egg-info\\SOURCES.txt'\n\nwarning: manifest_maker: standard file '-c' not found\n\nError: pg_config executable not found.\n\n\n\nPlease add the directory containing pg_config to the PATH\n\nor specify the full executable path with the option:\n\n\n\n python setup.py build_ext --pg-config /path/to/pg_config build ...\n\n\n\nor with the pg_config option in 'setup.cfg'.\n\n----------------------------------------\nCommand python setup.py egg_info failed with error code 1\nStoring complete log in C:\\Documents and Settings\\anlopes\\Application Data\\pip\\p\nip.log\n\nMy question, I only need to do this to get the psycopg2 working?\npython setup.py build_ext --pg-config /path/to/pg_config build ...\n\n", "output": "I found this post searching 
for a Linux solution to this problem.\nThis post by \"goshawk\" gave me the solution: run sudo apt-get install libpq-dev python-dev if you are on Ubuntu/Debian.\nUpdate\nSince more people have found this answer useful and posted their own solutions, here's a list of them:\nDebian/Ubuntu\nPython 2\nsudo apt install libpq-dev python-dev\n\nPython 3\nsudo apt install libpq-dev python3-dev\n\nAdditional\nIf none of the above solve your issue, try\nsudo apt install build-essential\n\nor\nsudo apt install postgresql-server-dev-all\n\n"} {"input": "differentiate null=True, blank=True in django When we add a database field in django we generally write models.CharField(max_length=100, null=True, blank=True). The same is done with ForeignKey, DecimalField etc. What is the basic difference in having \n\nnull=True only\nblank=True only\nnull=True, blank=True\n\nin respect to different (CharField, ForeignKey, ManyToManyField, DateTimeField) fields. What are the advantages/disadvantages of using 1/2/3?\n", "output": "null=True sets NULL (versus NOT NULL) on the column in your DB. Blank values for Django field types such as DateTimeField or ForeignKey will be stored as NULL in the DB.\nblank=True determines whether the field will be required in forms. This includes the admin and your own custom forms. If blank=True then the field will not be required, whereas if it's False the field cannot be blank.\nThe combo of the two is so frequent because typically if you're going to allow a field to be blank in your form, you're going to also need your database to allow NULL values for that field. The exception is CharFields and TextFields, which in Django are never saved as NULL. 
Blank values are stored in the DB as an empty string ('').\nA few examples:\nmodels.DateTimeField(blank=True) # raises IntegrityError if blank\n\nmodels.DateTimeField(null=True) # NULL allowed, but must be filled out in a form\n\nObviously those two options don't make logical sense to use (though, there might be a use case for null=True, blank=False if you want a field to always be required in forms, but optional when dealing with an object through something like the shell.)\nmodels.CharField(blank=True) # No problem, blank is stored as ''\n\nmodels.CharField(null=True) # NULL allowed, but will never be set as NULL\n\nCHAR and TEXT types are never saved as NULL by Django, so null=True is unnecessary. However, you can manually set one of these fields to None to force set it as NULL. If you have a scenario where that might be necessary, you should still include null=True.\n"} {"input": "Is there a portable way to get the current username in Python? Is there a portable way to get the current user's username in Python (i.e., one that works under both Linux and Windows, at least). It would work like os.getuid:\n>>> os.getuid()\n42\n>>> os.getusername()\n'slartibartfast'\n\nI googled around and was surprised not to find a definitive answer (although perhaps I was just googling poorly). The pwd module provides a relatively easy way to achieve this under, say, Linux, but it is not present on Windows. Some of the search results suggested that getting the username under Windows can be complicated in certain circumstances (e.g., running as a Windows service), although I haven't verified that.\n", "output": "Look at getpass module\nimport getpass\ngetpass.getuser()\n'kostya'\n\nAvailability: Unix, Windows\n\np.s. Per comment below \"this function looks at the values of various environment variables to determine the user name. 
Therefore, this function should not be relied on for access control purposes (or possibly any other purpose, since it allows any user to impersonate any other).\"\n"} {"input": "Delete column from pandas DataFrame When deleting a column in a DataFrame I use:\ndel df['column_name']\n\nand this works great. Why can't I use:\ndel df.column_name\n\nAs you can access the column/Series as df.column_name, I expect this to work.\n", "output": "The best way to do this in pandas is to use drop:\ndf = df.drop('column_name', 1)\n\nwhere 1 is the axis number (0 for rows and 1 for columns.)\nTo delete the column without having to reassign df you can do:\ndf.drop('column_name', axis=1, inplace=True)\n\nFinally, to drop by column number instead of by column label, try this to delete, e.g. the 1st, 2nd and 4th columns:\ndf.drop(df.columns[[0, 1, 3]], axis=1) # df.columns is zero-based pd.Index \n\n"} {"input": "How to check file size in python? I am writing a Python script in Windows. I want to do something based on the file size. For example, if the size is greater than 0, I will send an email to somebody, otherwise continue to other things. \nHow do I check the file size?\n", "output": "Like this (credit http://www.daniweb.com/forums/thread78629.html):\n>>> import os\n>>> b = os.path.getsize(\"/path/isa_005.mp3\")\n>>> b\n2071611L\n\n"} {"input": "Emulate a do-while loop in Python? I need to emulate a do-while loop in a Python program. 
Unfortunately, the following straightforward code does not work:\nl = [ 1, 2, 3 ]\ni = l.__iter__()\ns = None\nwhile True :\n if s :\n print s\n try :\n s = i.next()\n except StopIteration :\n break\nprint \"done\"\n\nInstead of \"1,2,3,done\", it prints the following output:\n[stdout:]1\n[stdout:]2\n[stdout:]3\nNone['Traceback (most recent call last):\n', ' File \"test_python.py\", line 8, in \n s = i.next()\n', 'StopIteration\n']\n\nWhat can I do in order to catch the 'stop iteration' exception and break a while\nloop properly?\nAn example of why such a thing may be needed is shown below as pseudocode.\nState machine:\ns = \"\"\nwhile True :\n if state is STATE_CODE :\n if \"//\" in s :\n tokens.add( TOKEN_COMMENT, s.split( \"//\" )[1] )\n state = STATE_COMMENT\n else :\n tokens.add( TOKEN_CODE, s )\n if state is STATE_COMMENT :\n if \"//\" in s :\n tokens.append( TOKEN_COMMENT, s.split( \"//\" )[1] )\n else\n state = STATE_CODE\n # Re-evaluate same line\n continue\n try :\n s = i.next()\n except StopIteration :\n break\n\n", "output": "I am not sure what you are trying to do. You can implement a do-while loop like this:\nwhile True:\n stuff()\n if fail_condition:\n break\n\nOr:\nstuff()\nwhile not fail_condition:\n stuff()\n\nWhat are you doing trying to use a do while loop to print the stuff in the list? Why not just use:\nfor i in l:\n print i\nprint \"done\"\n\nUpdate:\nSo do you have a list of lines? And you want to keep iterating through it? How about: \nfor s in l: \n while True: \n stuff() \n # use a \"break\" instead of s = i.next()\n\nDoes that seem like something close to what you would want? 
With your code example, it would be:\nfor s in some_list:\n while True :\n if state is STATE_CODE :\n if \"//\" in s :\n tokens.add( TOKEN_COMMENT, s.split( \"//\" )[1] )\n state = STATE_COMMENT\n else :\n tokens.add( TOKEN_CODE, s )\n if state is STATE_COMMENT :\n if \"//\" in s :\n tokens.append( TOKEN_COMMENT, s.split( \"//\" )[1] )\n break # get next s\n else :\n state = STATE_CODE\n # re-evaluate same line\n # continues automatically\n\n"} {"input": "Display number with leading zeros Given:\na = 1\nb = 10\nc = 100\n\nI want to display a leading zero for all numbers with less than 2 digits, i.e.:\n01\n10\n100\n\n", "output": "Here you are:\nprint \"%02d\" % (1,)\n\nBasically % is like printf or sprintf.\n"} {"input": "Pythonic way to create a long multi-line string I have a very long query. I would like to split it in several lines in python. A way to do it in javascript would be using several sentences and joining them with a + operator (I know, maybe it's not the most efficient way to do it, but I'm not really concerned about performance in this stage, just code readability). Example:\nvar long_string='some text not important. just garbage to'+\n 'illustrate my example';\n\nI tried doing something similar in python, but it didn't work, so I used \\ to split the long string. However, I'm not sure if this is the only/best/pythonicest way of doing it. It looks awkward. \nActual code:\nquery= 'SELECT action.descr as \"action\", '\\\n 'role.id as role_id,'\\\n 'role.descr as role'\\\n 'FROM '\\\n 'public.role_action_def,'\\\n 'public.role,'\\\n 'public.record_def, '\\\n 'public.action'\\\n 'WHERE role.id = role_action_def.role_id AND'\\\n 'record_def.id = role_action_def.def_id AND'\\\n 'action.id = role_action_def.action_id AND'\\\n 'role_action_def.account_id = '+account_id+' AND'\\\n 'record_def.account_id='+account_id+' AND'\\\n 'def_id='+def_id\n\n", "output": "Are you talking about multi-line strings? 
Easy, use triple quotes to start and end them.\ns = \"\"\" this is a very\n long string if I had the\n energy to type more and more ...\"\"\"\n\nYou can use single quotes too (3 of them of course at start and end) and treat the resulting string s just like any other string.\nNOTE: Just as with any string, anything between the starting and ending quotes becomes part of the string, so this example has a leading blank (as pointed out by @root45). This string will also contain both blanks and newlines.\nI.e.,:\n' this is a very\\n long string if I had the\\n energy to type more and more ...'\n\nFinally, one can also construct long lines in Python like this:\n s = (\"this is a very\"\n \"long string too\"\n \"for sure ...\"\n )\n\nwhich will not include any extra blanks or newlines (this is a deliberate example showing what the effect of skipping blanks will result in):\n'this is a verylong string toofor sure ...'\n\nNo commas required, simply place the strings to be joined together into a pair of parenthesis and be sure to account for any needed blanks and newlines.\n"} {"input": "How to install lxml on Ubuntu I'm having difficulty installing lxml with easy_install on Ubuntu 11.\nWhen I type $ easy_install lxml I get:\nSearching for lxml\nReading http://pypi.python.org/simple/lxml/\nReading http://codespeak.net/lxml\nBest match: lxml 2.3\nDownloading http://lxml.de/files/lxml-2.3.tgz\nProcessing lxml-2.3.tgz\nRunning lxml-2.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-7UdQOZ/lxml-2.3/egg-dist-tmp-GacQGy\nBuilding lxml version 2.3.\nBuilding without Cython.\nERROR: /bin/sh: xslt-config: not found\n\n** make sure the development packages of libxml2 and libxslt are installed **\n\nUsing build configuration of libxslt \nIn file included from src/lxml/lxml.etree.c:227:0:\nsrc/lxml/etree_defs.h:9:31: fatal error: libxml/xmlversion.h: No such file or directory\ncompilation terminated.\n\nIt seems that libxslt or libxml2 is not installed. 
I've tried following the instructions at http://www.techsww.com/tutorials/libraries/libxslt/installation/installing_libxslt_on_ubuntu_linux.php and http://www.techsww.com/tutorials/libraries/libxml/installation/installing_libxml_on_ubuntu_linux.php with no success.\nIf I try wget ftp://xmlsoft.org/libxml2/libxml2-sources-2.6.27.tar.gz I get\n\n==> SYST ... done. ==> PWD ... done.\n==> TYPE I ... done. ==> CWD (1) /libxml2 ... done.\n==> SIZE libxml2-sources-2.6.27.tar.gz ... done.\n==> PASV ... done. ==> RETR libxml2-sources-2.6.27.tar.gz ... \nNo such file `libxml2-sources-2.6.27.tar.gz'.\n\nIf I try the other first, I'll get to ./configure --prefix=/usr/local/libxslt --with-libxml-prefix=/usr/local/libxml2 and that will fail eventually with:\nchecking for libxml libraries >= 2.6.27... configure: error: Could not find libxml2 anywhere, check ftp://xmlsoft.org/.\n\nI've tried both versions 2.6.27 and 2.6.29 of libxml2 with no difference.\nLeaving no stone unturned, I have successfully done sudo apt-get install libxml2-dev, but this changes nothing.\n", "output": "Since you're on Ubuntu, don't bother with those source packages. Just install those development packages using apt-get.\napt-get install libxml2-dev libxslt1-dev python-dev\n\nIf you're happy with a possibly older version of lxml altogether though, you could try\napt-get install python-lxml\n\nand be done with it. :)\n"} {"input": "How does the @property decorator work? I would like to understand how the built-in function property works. 
What confuses me is that property can also be used as a decorator, but it only takes arguments when used as a built-in function and not when used as a decorator.\nThis example is from the documentation:\nclass C(object):\n def __init__(self):\n self._x = None\n\n def getx(self):\n return self._x\n def setx(self, value):\n self._x = value\n def delx(self):\n del self._x\n x = property(getx, setx, delx, \"I'm the 'x' property.\")\n\nproperty's arguments are getx, setx, delx and a doc string.\nIn the code below property is used as decorator. The object of it is the x function, but in the code above there is no place for an object function in the arguments.\nclass C(object):\n def __init__(self):\n self._x = None\n\n @property\n def x(self):\n \"\"\"I'm the 'x' property.\"\"\"\n return self._x\n\n @x.setter\n def x(self, value):\n self._x = value\n\n @x.deleter\n def x(self):\n del self._x\n\nAnd, how are the x.setter and x.deleter decorators created?\nI am confused.\n", "output": "The property() function returns a special descriptor object:\n>>> property()\n\n\nIt is this object that has extra methods:\n>>> property().getter\n\n>>> property().setter\n\n>>> property().deleter\n\n\nThese act as decorators too. They return a new property object:\n>>> property().getter(None)\n\n\nthat is a copy of the old object, but with one of the functions replaced.\nRemember, that the @decorator syntax is just syntactic sugar; the syntax:\n@property\ndef foo(self): return self._foo\n\nreally means the same thing as\ndef foo(self): return self._foo\nfoo = property(foo)\n\nso foo the function is replaced by property(foo), which we saw above is a special object. 
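To make the syntactic-sugar point concrete, here is a small self-contained check; the class names are invented for illustration:

```python
class WithDecorator(object):
    @property
    def x(self):
        return 42

class WithoutDecorator(object):
    def x(self):
        return 42
    x = property(x)  # exactly what @property does behind the scenes

print(WithDecorator().x, WithoutDecorator().x)  # 42 42
```

Both classes behave identically, because the decorator form is just a rebinding of the name x to property(x).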
Then when you use @foo.setter(), what you are doing is call that property().setter method I showed you above, which returns a new copy of the property, but this time with the setter function replaced with the decorated method.\nThe following sequence also creates a full-on property, by using those decorator methods.\nFirst we create some functions and a property object with just a getter:\n>>> def getter(self): print 'Get!'\n... \n>>> def setter(self, value): print 'Set to {!r}!'.format(value)\n... \n>>> def deleter(self): print 'Delete!'\n... \n>>> prop = property(getter)\n>>> prop.fget is getter\nTrue\n>>> prop.fset is None\nTrue\n>>> prop.fdel is None\nTrue\n\nNext we use the .setter() method to add a setter:\n>>> prop = prop.setter(setter)\n>>> prop.fget is getter\nTrue\n>>> prop.fset is setter\nTrue\n>>> prop.fdel is None\nTrue\n\nLast we add a deleter with the .deleter() method:\n>>> prop = prop.deleter(deleter)\n>>> prop.fget is getter\nTrue\n>>> prop.fset is setter\nTrue\n>>> prop.fdel is deleter\nTrue\n\nLast but not least, the property object acts as a descriptor object, so it has .__get__(), .__set__() and .__delete__() methods to hook into instance attribute getting, setting and deleting:\n>>> class Foo(object): pass\n... 
\n>>> prop.__get__(Foo(), Foo)\nGet!\n>>> prop.__set__(Foo(), 'bar')\nSet to 'bar'!\n>>> prop.__delete__(Foo())\nDelete!\n\nThe Descriptor Howto includes a pure python sample implementation of the property() type:\n\nclass Property(object):\n \"Emulate PyProperty_Type() in Objects/descrobject.c\"\n\n def __init__(self, fget=None, fset=None, fdel=None, doc=None):\n self.fget = fget\n self.fset = fset\n self.fdel = fdel\n if doc is None and fget is not None:\n doc = fget.__doc__\n self.__doc__ = doc\n\n def __get__(self, obj, objtype=None):\n if obj is None:\n return self\n if self.fget is None:\n raise AttributeError(\"unreadable attribute\")\n return self.fget(obj)\n\n def __set__(self, obj, value):\n if self.fset is None:\n raise AttributeError(\"can't set attribute\")\n self.fset(obj, value)\n\n def __delete__(self, obj):\n if self.fdel is None:\n raise AttributeError(\"can't delete attribute\")\n self.fdel(obj)\n\n def getter(self, fget):\n return type(self)(fget, self.fset, self.fdel, self.__doc__)\n\n def setter(self, fset):\n return type(self)(self.fget, fset, self.fdel, self.__doc__)\n\n def deleter(self, fdel):\n return type(self)(self.fget, self.fset, fdel, self.__doc__)\n\n\n"} {"input": "Retrieving python module path I want to detect whether a module has changed. Now, using inotify is simple, you just need to know the directory you want to get notifications from.\nHow do I retrieve a module's path in python?\n", "output": "import a_module\nprint a_module.__file__\n\nWill actually give you the path to the .pyc file that was loaded, at least on Mac OS X. So I guess you can do\nimport os\npath = os.path.dirname(amodule.__file__)\n\nYou can also try\npath = os.path.abspath(amodule.__file__)\n\nTo get the directory to look for changes.\n"} {"input": "Removing duplicates in lists Pretty much I need to write a program to check if a list has any duplicates and if it does it removes them and returns a new list with the items that weren't duplicated/removed. 
This is what I have but to be honest I do not know what to do.\ndef remove_duplicates():\n t = ['a', 'b', 'c', 'd']\n t2 = ['a', 'c', 'd']\n for t in t2:\n t.append(t.remove())\n return t\n\n", "output": "The common approach to get a unique collection of items is to use a set. Sets are unordered collections of distinct objects. To create a set from any iterable, you can simply pass it to the built-in set() function. If you later need a real list again, you can similarly pass the set to the list() function.\nThe following example should cover whatever you are trying to do:\n>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]\n>>> t\n[1, 2, 3, 1, 2, 5, 6, 7, 8]\n>>> list(set(t))\n[1, 2, 3, 5, 6, 7, 8]\n>>> s = [1, 2, 3]\n>>> list(set(t) - set(s))\n[8, 5, 6, 7]\n\nAs you can see from the example result, the original order is not maintained. As mentioned above, sets themselves are unordered collections, so the order is lost. When converting a set back to a list, an arbitrary order is created.\nIf order is important to you, then you will have to use a different mechanism. This question covers that topic in more detail.\n"} {"input": "What is the difference between dict.items() and dict.iteritems()? Are there any applicable differences between dict.items() and dict.iteritems()?\nFrom the Python docs:\n\ndict.items(): Return a copy of the dictionary's list of (key, value) pairs.\ndict.iteritems(): Return an iterator over the dictionary's (key, value) pairs.\n\nIf I run the code below, each seems to return a reference to the same object. 
Are there any subtle differences that I am missing?\n#!/usr/bin/python\n\nd={1:'one',2:'two',3:'three'}\nprint 'd.items():'\nfor k,v in d.items():\n if d[k] is v: print '\\tthey are the same object' \n else: print '\\tthey are different'\n\nprint 'd.iteritems():' \nfor k,v in d.iteritems():\n if d[k] is v: print '\\tthey are the same object' \n else: print '\\tthey are different' \n\nOutput:\nd.items():\n they are the same object\n they are the same object\n they are the same object\nd.iteritems():\n they are the same object\n they are the same object\n they are the same object\n\n", "output": "It's part of an evolution. \nOriginally, Python items() built a real list of tuples and returned that. That could potentially take a lot of extra memory. \nThen, generators were introduced to the language in general, and that method was reimplemented as an iterator-generator method named iteritems(). The original remains for backwards compatibility. \nOne of Python 3's changes is that items() now returns iterators, and a list is never fully built. The iteritems() method is also gone, since items() in Python 3 works like viewitems() in Python 2.7. \n"} {"input": "Python `if x is not None` or `if not x is None`? I've always thought of the if not x is None version to be more clear, but Google's style guide implies (based on this excerpt) that they use if x is not None. Is there any minor performance difference (I'm assuming not), and is there any case where one really doesn't fit (making the other a clear winner for my convention)?*\n*I'm referring to any singleton, rather than just None.\n\n...to compare singletons like\n None. Use is or is not.\n\n", "output": "There's no performance difference, as they compile to the same bytecode:\nPython 2.6.2 (r262:71600, Apr 15 2009, 07:20:39)\n>>> import dis\n>>> def f(x):\n... 
return x is not None\n...\n>>> dis.dis(f)\n 2 0 LOAD_FAST 0 (x)\n 3 LOAD_CONST 0 (None)\n 6 COMPARE_OP 9 (is not)\n 9 RETURN_VALUE\n>>> def g(x):\n... return not x is None\n...\n>>> dis.dis(g)\n 2 0 LOAD_FAST 0 (x)\n 3 LOAD_CONST 0 (None)\n 6 COMPARE_OP 9 (is not)\n 9 RETURN_VALUE\n\nStylistically, I try to avoid not x is y. Although the compiler will always treat it as not (x is y), a human reader might misunderstand the construct as (not x) is y. If I write x is not y then there is no ambiguity.\n"} {"input": "How do I do a not equal in Django queryset filtering? In Django model QuerySets, I see that there is a __gt and __lt for comparitive values, but is there a __ne/!=/<> (not equals?)\nI want to filter out using a not equals:\nExample:\nModel:\n bool a;\n int x;\n\nI want\nresults = Model.objects.exclude(a=true, x!=5)\n\nThe != is not correct syntax. I tried __ne, <>.\nI ended up using:\nresults = Model.objects.exclude(a=true, x__lt=5).exclude(a=true, x__gt=5)\n\n", "output": "Maybe Q objects could be of help for this problem. I've never used them but it seems they can be negated and combined much like normal python expressions.\nUpdate: I Just tried it out, it seems to work pretty well:\n>>> from myapp.models import Entry\n>>> from django.db.models import Q\n\n>>> Entry.objects.filter(~Q(id = 3))\n\n[, , , ...]\n\n"} {"input": "What is the Python 3 equivalent of \"python -m SimpleHTTPServer\" What is the Python 3 equivalent of python -m SimpleHTTPServer?\n", "output": "From the docs:\n\nThe SimpleHTTPServer module has been merged into http.server in Python 3.0. The 2to3 tool will automatically adapt imports when converting your sources to 3.0.\n\nSo, your command is python3 -m http.server.\n"} {"input": "Why are Python's 'private' methods not actually private? Python gives us the ability to create 'private' methods and variables within a class by prepending double underscores to the name, like this: __myPrivateMethod(). 
How, then, can one explain this\n>>> class MyClass:\n... def myPublicMethod(self):\n... print 'public method'\n... def __myPrivateMethod(self):\n... print 'this is private!!'\n... \n>>> obj = MyClass()\n>>> obj.myPublicMethod()\npublic method\n>>> obj.__myPrivateMethod()\nTraceback (most recent call last):\n File \"\", line 1, in \nAttributeError: MyClass instance has no attribute '__myPrivateMethod'\n>>> dir(obj)\n['_MyClass__myPrivateMethod', '__doc__', '__module__', 'myPublicMethod']\n>>> obj._MyClass__myPrivateMethod()\nthis is private!!\n\nWhat's the deal?!\nI'll explain this a little for those who didn't quite get that.\n>>> class MyClass:\n... def myPublicMethod(self):\n... print 'public method'\n... def __myPrivateMethod(self):\n... print 'this is private!!'\n... \n>>> obj = MyClass()\n\nWhat I did there is create a class with a public method and a private method and instantiate it.\nNext, I call its public method.\n>>> obj.myPublicMethod()\npublic method\n\nNext, I try and call its private method.\n>>> obj.__myPrivateMethod()\nTraceback (most recent call last):\n File \"\", line 1, in \nAttributeError: MyClass instance has no attribute '__myPrivateMethod'\n\nEverything looks good here; we're unable to call it. It is, in fact, 'private'. Well, actually it isn't. Running dir() on the object reveals a new magical method that python creates magically for all of your 'private' methods.\n>>> dir(obj)\n['_MyClass__myPrivateMethod', '__doc__', '__module__', 'myPublicMethod']\n\nThis new method's name is always an underscore, followed by the class name, followed by the method name.\n>>> obj._MyClass__myPrivateMethod()\nthis is private!!\n\nSo much for encapsulation, eh?\nIn any case, I'd always heard Python doesn't support encapsulation, so why even try? What gives?\n", "output": "The name scrambling is used to ensure that subclasses don't accidentally override the private methods and attributes of their superclasses. 
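As a quick runnable demonstration of that rewriting (the class name here is invented):

```python
class Secretive(object):
    def __hidden(self):           # stored as _Secretive__hidden
        return 'found me'

s = Secretive()
# the unmangled name does not exist on the instance...
print(hasattr(s, '__hidden'))     # False
# ...but the mangled one is plain to see and call
print(s._Secretive__hidden())     # found me
```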
It's not designed to prevent deliberate access from outside.\nFor example:\n>>> class Foo(object):\n... def __init__(self):\n... self.__baz = 42\n... def foo(self):\n... print self.__baz\n... \n>>> class Bar(Foo):\n... def __init__(self):\n... super(Bar, self).__init__()\n... self.__baz = 21\n... def bar(self):\n... print self.__baz\n...\n>>> x = Bar()\n>>> x.foo()\n42\n>>> x.bar()\n21\n>>> print x.__dict__\n{'_Bar__baz': 21, '_Foo__baz': 42}\n\nOf course, it breaks down if two different classes have the same name.\n"} {"input": "Python __init__ and self what do they do? I'm learning the Python programming language, and I've come across certain things I don't fully understand. I'm coming from a C background, but I never went far with that either.\nWhat I'm trying to figure out is:\nIn a method:\ndef method(self, blah):\n def __init__(?):\n ....\n ....\n\nWhat does self do? what is it meant to be? and is it mandatory?\nWhat does the __init__ method do? why is it necessary? etc\nI think they might be oop constructs, but I don't know very much..\n", "output": "In this code:\nclass A(object):\n def __init__(self):\n self.x = 'Hello'\n\n def method_a(self, foo):\n print self.x + ' ' + foo\n\n... the self variable represents the instance of the object itself. Most object-oriented languages pass this as a hidden parameter to the methods defined on an object; Python does not. You have to declare it explicitly. When you create an instance of the A class and call its methods, it will be passed automatically, as in ...\na = A() # We do not pass any argument to the __init__ method\na.method_a('Sailor!') # We only pass a single argument\n\nThe __init__ method is roughly what represents a constructor in Python. When you call A() Python creates an object for you, and passes it as the first parameter to the __init__ method. 
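A compact, self-contained sketch of that calling convention, with a constructor argument (the Greeter class and its names are invented for illustration):

```python
class Greeter(object):
    def __init__(self, name):   # the new instance arrives as 'self'
        self.name = name

    def greet(self, punctuation):
        return 'Hello ' + self.name + punctuation

g = Greeter('Sailor')           # 'Sailor' binds to 'name'; 'self' is implicit
print(g.greet('!'))             # Hello Sailor!
```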
Any additional parameters (e.g., A(24, 'Hello')) will also get passed as arguments--in this case causing an exception to be raised, since the constructor isn't expecting them.\n"} {"input": "Running shell command from Python and capturing the output I want to write a function that will execute a shell command and return its output as a string, no matter, is it an error or success message. I just want to get the same result that I would have gotten with the command line.\nWhat would be a code example that would do such a thing?\nFor example:\ndef run_command(cmd):\n # ??????\n\nprint run_command('mysqladmin create test -uroot -pmysqladmin12')\n# Should output something like:\n# mysqladmin: CREATE DATABASE failed; error: 'Can't create database 'test'; database exists'\n\n", "output": "For convenience, Python 2.7 provides the \nsubprocess.check_output(*popenargs, **kwargs) \n\nfunction, which takes the same arguments as Popen, but returns a string containing the program's output. You can pass stderr=subprocess.STDOUT to ensure that error messages are included in the returned output -- but don't pass stderr=subprocess.PIPE to check_output. It can cause deadlocks. If you need to pipe from stderr, see the Popen example below.\nIf you're using an older python, Vartec's method will work. But the better way to go -- at least in simple cases that don't require real-time output capturing -- is to use communicate. As in:\noutput = subprocess.Popen([\"mycmd\", \"myarg\"], stdout=subprocess.PIPE).communicate()[0]\n\nOr \n>>> import subprocess\n>>> p = subprocess.Popen(['ls', '-a'], stdout=subprocess.PIPE, \n... stderr=subprocess.PIPE)\n>>> out, err = p.communicate()\n>>> print out\n.\n..\nfoo\n\nIf you set stdin=PIPE, communicate also allows you to pass data to the process via stdin:\n>>> cmd = ['awk', 'length($0) > 5']\n>>> p = subprocess.Popen(cmd, stdout=subprocess.PIPE,\n... stderr=subprocess.PIPE,\n... 
stdin=subprocess.PIPE)\n>>> out, err = p.communicate('foo\\nfoofoo\\n')\n>>> print out\nfoofoo\n\nFinally, note Aaron Hall's answer, which indicates that on some systems, you may need to set stdout, stderr, and stdin all to PIPE (or DEVNULL) to get communicate to work at all.\n"} {"input": "How do I \"cd\" in Python? cd as in the shell command to change the working directory.\nHow do I change the current working directory in Python?\n", "output": "You can change the working directory with\nos.chdir(path)\n\nThere are two best practices to follow when using this method:\n\nCatch the exception (WindowsError, OSError) on invalid path. If the exception is thrown, do not perform any recursive operations, especially destructive ones. They will operate on the old path and not the new one.\nReturn to your old directory when you're done. This can be done in an exception-safe manner by wrapping your chdir call in a context manager, like Brian M. Hunt did in his answer.\n\nChanging the current working directory in a subprocess does not change the current working directory in the parent process. This is true of the Python interpreter as well. You cannot use os.chdir() to change the CWD of the calling process.\n"} {"input": "Python's equivalent of && in an if statement Here's my code:\n# F. front_back\n# Consider dividing a string into two halves.\n# If the length is even, the front and back halves are the same length.\n# If the length is odd, we'll say that the extra char goes in the front half.\n# e.g. 'abcde', the front half is 'abc', the back half 'de'.\n# Given 2 strings, a and b, return a string of the form\n# a-front + b-front + a-back + b-back\ndef front_back(a, b):\n # +++your code here+++\n if len(a) % 2 == 0 && len(b) % 2 == 0:\n return a[:(len(a)/2)] + b[:(len(b)/2)] + a[(len(a)/2):] + b[(len(b)/2):] \n else:\n #todo! Not yet done. :P\n return\n\nI'm getting an error in the IF conditional. 
What am I doing wrong?\n", "output": "You would want and instead of &&.\n"} {"input": "What are the differences between the urllib, urllib2, and requests module? In Python, what are the differences between the urllib, urllib2, and requests module? Why are there three? They seem to do the same thing...\n", "output": "I know it's been said already, but I'd highly recommend the Requests python package:\nhttp://docs.python-requests.org/en/latest/index.html\nIf you've used languages other than python, you're probably thinking urllib and urllib2 are easy to use, not much code, and highly capable; that's how I used to think. But the Requests package is so unbelievably useful and short that everyone should be using it.\nFirst, it supports a fully RESTful API, and is as easy as:\nimport requests\n...\n\nresp = requests.get('http://www.mywebsite.com/user')\nresp = requests.post('http://www.mywebsite.com/user')\nresp = requests.put('http://www.mywebsite.com/user/put')\nresp = requests.delete('http://www.mywebsite.com/user/delete')\n\nRegardless of whether you GET or POST, you never have to encode parameters again; it simply takes a dictionary as an argument and is good to go.\nuserdata = {\"firstname\": \"John\", \"lastname\": \"Doe\", \"password\": \"jdoe123\"}\nresp = requests.post('http://www.mywebsite.com/user', data=userdata)\n\nPlus it even has a built-in JSON decoder (again, I know json.loads() isn't a lot more to write, but this sure is convenient):\nresp.json()\n\nOr if your response data is just text, use:\nresp.text\n\nThis is just the tip of the iceberg. 
This is the list of features from the requests site:\n\nInternational Domains and URLs\nKeep-Alive & Connection Pooling\nSessions with Cookie Persistence\nBrowser-style SSL Verification\nBasic/Digest Authentication\nElegant Key/Value Cookies\nAutomatic Decompression\nUnicode Response Bodies\nMultipart File Uploads\nConnection Timeouts\n.netrc support\nPython 2.6-3.4\nThread-safe.\n\n"} {"input": "Is there a simple way to delete a list element by value in python? a=[1,2,3,4]\nb=a.index(6)\ndel a[b]\nprint a\n\nThe above shows the following error:\nTraceback (most recent call last):\n File \"D:\\zjm_code\\a.py\", line 6, in \n b=a.index(6)\nValueError: list.index(x): x not in list\n\nSo I have to do this:\na=[1,2,3,4]\ntry:\n b=a.index(6)\n del a[b]\nexcept:\n pass\nprint a\n\nBut is there not a simpler way to do this?\n", "output": "To remove an element's first occurrence in a list, simply use list.remove:\n>>> a = ['a', 'b', 'c', 'd']\n>>> a.remove('b')\n>>> print a\n['a', 'c', 'd']\n\nMind that it does not remove all occurrences of your element. 
Use a list comprehension for that.\n>>> a = [1, 2, 3, 4, 2, 3, 4, 2, 7, 2]\n>>> a = [x for x in a if x != 2]\n>>> print a\n[1, 3, 4, 3, 4, 7]\n\n"} {"input": "How to fix \"Attempted relative import in non-package\" even with __init__.py I'm trying to follow PEP 328, with the following directory structure:\npkg/\n __init__.py\n components/\n core.py\n __init__.py\n tests/\n core_test.py\n __init__.py\n\nIn core_test.py I have the following import statement\nfrom ..components.core import GameLoopEvents\n\nHowever, when I run, I get the following error:\ntests$ python core_test.py \nTraceback (most recent call last):\n File \"core_test.py\", line 3, in \n from ..components.core import GameLoopEvents\nValueError: Attempted relative import in non-package\n\nSearching around I found \"relative path not working even with __init__.py\" and \"Import a module from a relative path\" but they didn't help.\nIs there anything I'm missing here?\n", "output": "To elaborate on @Ignacio's answer:\nThe Python import mechanism works relative to the __name__ of the current file. When you execute a file directly, it doesn't have its usual name, but has \"__main__\" as its name instead. So relative imports don't work.\nYou can, as Igancio suggested, execute it using the -m option. If you have a part of your package that is meant to be run as a script, you can also use the __package__ attribute to tell that file what name it's supposed to have in the package hierarchy. \nSee http://www.python.org/dev/peps/pep-0366/ for details.\n"} {"input": "Python - Split Strings with Multiple Delimiters I think what I want to do is a fairly common task but I've found no reference on the web. I have text, with punctuation, and I want list of the words. \n\"Hey, you - what are you doing here!?\"\n\nshould be\n['hey', 'you', 'what', 'are', 'you', 'doing', 'here']\n\nBut Python's str.split() only works with one argument... So I have all words with the punctuation after I split with whitespace. 
Any ideas?\n", "output": "A case where regular expressions are justified:\nimport re\nDATA = \"Hey, you - what are you doing here!?\"\nprint re.findall(r\"[\\w']+\", DATA)\n# Prints ['Hey', 'you', 'what', 'are', 'you', 'doing', 'here']\n\n[Edited to include ' in the word characters - thanks, Danosaure.]\n"} {"input": "Converting unix timestamp string to readable date in Python I have a string representing a unix timestamp (i.e. \"1284101485\") in Python, and I'd like to convert it to a readable date. When I use time.strftime, I get a TypeError:\n>>>import time\n>>>print time.strftime(\"%B %d %Y\", \"1284101485\")\n\nTraceback (most recent call last):\n File \"\", line 1, in \nTypeError: argument must be 9-item sequence, not str\n\n", "output": "Use datetime module:\nimport datetime\nprint(\n datetime.datetime.fromtimestamp(\n int(\"1284101485\")\n ).strftime('%Y-%m-%d %H:%M:%S')\n)\n\nIn this code datetime.datetime can look strange, but 1st datetime is module name and 2nd is class name. So datetime.datetime.fromtimestamp() is fromtimestamp() method of datetime class from datetime module.\n"} {"input": "Argparse optional positional arguments? 
I have script which is meant to be used like this:\nusage: installer.py dir [-h] [-v]\ndir is a positional argument which is defined like this:\nparser.add_argument('dir', default=os.getcwd())\n\nI want dir to be optional: when it's not specified it should just be cwd.\nUnfortunately when I don't specify the dir argument, i get \"Error: Too few arguments\"\n", "output": "Try using nargs='?'\nparser.add_argument('dir', nargs='?', default=os.getcwd())\n\nextended example:\n>>> import os, argparse\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('-v', action='store_true')\n_StoreTrueAction(option_strings=['-v'], dest='v', nargs=0, const=True, default=False, type=None, choices=None, help=None, metavar=None)\n>>> parser.add_argument('dir', nargs='?', default=os.getcwd())\n_StoreAction(option_strings=[], dest='dir', nargs='?', const=None, default='/home/vinay', type=None, choices=None, help=None, metavar=None)\n>>> parser.parse_args('somedir -v'.split())\nNamespace(dir='somedir', v=True)\n>>> parser.parse_args('-v'.split())\nNamespace(dir='/home/vinay', v=True)\n>>> parser.parse_args(''.split())\nNamespace(dir='/home/vinay', v=False)\n>>> parser.parse_args(['somedir'])\nNamespace(dir='somedir', v=False)\n>>> parser.parse_args('somedir -h -v'.split())\nusage: [-h] [-v] [dir]\n\npositional arguments:\n dir\n\noptional arguments:\n -h, --help show this help message and exit\n -v\n\n"} {"input": "Python - time.clock() vs. time.time() - accuracy? Which is better to use for timing in Python? time.clock() or time.time()? Which one provides more accuracy?\nfor example:\nstart = time.clock()\n... do something\nelapsed = (time.clock() - start)\n\nvs.\nstart = time.time()\n... 
do something\nelapsed = (time.time() - start)\n\n", "output": "As of 3.3, time.clock() is deprecated, and it's suggested to use time.process_time() or time.perf_counter() instead.\nPreviously in 2.7, according to the time module docs:\n\ntime.clock()\nOn Unix, return the current processor time as a floating point number\n expressed in seconds. The precision, and in fact the very definition\n of the meaning of \"processor time\", depends on that of the C function\n of the same name, but in any case, this is the function to use for\n benchmarking Python or timing algorithms.\nOn Windows, this function returns wall-clock seconds elapsed since the\n first call to this function, as a floating point number, based on the\n Win32 function QueryPerformanceCounter(). The resolution is typically\n better than one microsecond.\n\nAdditionally, there is the timeit module for benchmarking code snippets.\n"} {"input": "JSON datetime between Python and JavaScript I want to send a datetime.datetime object in serialized form from Python using JSON and de-serialize in JavaScript using JSON. What is the best way to do this?\n", "output": "You can add the 'default' parameter to json.dumps to handle this:\ndate_handler = lambda obj: (\n obj.isoformat()\n if isinstance(obj, datetime.datetime)\n or isinstance(obj, datetime.date)\n else None\n)\njson.dumps(datetime.datetime.now(), default=date_handler)\n'\"2010-04-20T20:08:21.634121\"'\n\nWhich is ISO 8601 format. 
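A runnable sketch of the same default-handler idea with a fixed date, so the result is deterministic (the dictionary key is invented for illustration):

```python
import datetime
import json

# handler passed via json.dumps(default=...): called for objects
# the encoder cannot serialize on its own
date_handler = lambda obj: (
    obj.isoformat()
    if isinstance(obj, (datetime.datetime, datetime.date))
    else None
)

s = json.dumps({'when': datetime.date(2010, 4, 20)}, default=date_handler)
print(s)  # {"when": "2010-04-20"}
```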
\nA more comprehensive default handler function:\ndef handler(obj):\n if hasattr(obj, 'isoformat'):\n return obj.isoformat()\n elif isinstance(obj, ...):\n return ...\n else:\n raise TypeError, 'Object of type %s with value of %s is not JSON serializable' % (type(obj), repr(obj))\n\nUpdate: Added output of type as well as value.\nUpdate: Also handle date \n"} {"input": "fatal error: Python.h: No such file or directory I am trying to build a shared library using a C extension file but first I have to generate the output file using the command below:\ngcc -Wall utilsmodule.c -o Utilc\n\nAfter executing the command, I get this error message:\n\nutilsmodule.c:1:20: fatal error: Python.h: No such file or directory\n compilation terminated.\n\nin fact I have tried all the suggested solutions over the internet but the problem still exists ... also I have no problem with Python.h. I managed to locate the file on my machine ... anybody has faced the same problem before??\n", "output": "Looks like you haven't properly installed the header files and static libraries for python dev. Use your package manager to install them system-wide. \nFor apt (ubuntu, debian...):\nsudo apt-get install python-dev # for python2.x installs\nsudo apt-get install python3-dev # for python3.x installs\n\nFor yum (centos, redhat, fedora...):\nsudo yum install python-devel\n\n"} {"input": "How can I subtract a day from a Python date? I have a Python datetime.datetime object. What is the best way to subtract one day?\n", "output": "You can use a timedelta object:\nfrom datetime import datetime, timedelta\n\nd = datetime.today() - timedelta(days=days_to_subtract)\n\n"} {"input": "Convert date to datetime in Python Silly question, but is there a built-in method for converting a date to a datetime in Python, ie. getting the datetime for the midnight of the date? The opposite conversion is easy - datetime has a .date() method. 
Do I really have to manually call datetime(d.year, d.month, d.day) ?\n", "output": "You can use datetime.combine(date, time); for the time, you create a datetime.time object initialized to midnight.\nfrom datetime import date\nfrom datetime import datetime\nd = date.today()\ndatetime.combine(d, datetime.min.time())\n\n"} {"input": "List filtering: list comprehension vs. lambda + filter I happened to find myself having a basic filtering need: I have a list and I have to filter it by an attribute of the items.\nMy code looked like this:\nmy_list = [x for x in my_list if x.attribute == value]\n\nBut then I thought, wouldn't it be better to write it like this?\nmy_list = filter(lambda x: x.attribute == value, my_list)\n\nIt's more readable, and if needed for performance the lambda could be taken out to gain something. \nQuestion is: are there any caveats in using the second way? Any performance difference? Am I missing the Pythonic Way\u2122 entirely and should do it in yet another way (such as using itemgetter instead of the lambda)?\n", "output": "It is strange how much beauty varies for different people. I find the list comprehension much clearer than the ugly filter+lambda, but use whichever you find easier. However, do stop giving your variables names already used for builtins, that's just ugly, and not open for discussion.\nThere are two things that may slow down your use of filter.\nThe first is the function call overhead: as soon as you use a Python function (whether created by def or lambda) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn't think much about performance until you've timed your code and found it to be a bottleneck, but the difference will be there.\nThe other overhead that might apply is that the lambda is being forced to access a scoped variable (value). 
That is slower than accessing a local variable and in Python 2.x the list comprehension only accesses local variables. If you are using Python 3.x the list comprehension runs in a separate function so it will also be accessing value through a closure and this difference won't apply.\nThe other option to consider is to use a generator instead of a list comprehension:\ndef filterbyvalue(seq, value):\n for el in seq:\n if el.attribute==value: yield el\n\nThen in your main code (which is where readability really matters) you've replaced both list comprehension and filter with a hopefully meaningful function name.\n"} {"input": "Converting a String to Dictionary? How can I convert the str representation of a dict, such as the following string, into a dict?\ns = \"{'muffin' : 'lolz', 'foo' : 'kitty'}\"\n\nI prefer not to use eval. What else can I use?\nThe main reason for this, is one of my coworkers classes he wrote, converts all input into strings. I'm not in the mood to go and modify his classes, to deal with this issue.\n", "output": "Starting in Python 2.6 you can use the built-in ast.literal_eval:\n>>> import ast\n>>> ast.literal_eval(\"{'muffin' : 'lolz', 'foo' : 'kitty'}\")\n{'muffin': 'lolz', 'foo': 'kitty'}\n\nThis is safer than using eval. As its own docs say:\n\n>>> help(ast.literal_eval)\nHelp on function literal_eval in module ast:\n\nliteral_eval(node_or_string)\n Safely evaluate an expression node or a string containing a Python\n expression. 
The string or node provided may only consist of the following\n Python literal structures: strings, numbers, tuples, lists, dicts, booleans,\n and None.\n\nFor example:\n>>> eval(\"shutil.rmtree('mongo')\")\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\n File \"<string>\", line 1, in <module>\n File \"/opt/Python-2.6.1/lib/python2.6/shutil.py\", line 208, in rmtree\n onerror(os.listdir, path, sys.exc_info())\n File \"/opt/Python-2.6.1/lib/python2.6/shutil.py\", line 206, in rmtree\n names = os.listdir(path)\nOSError: [Errno 2] No such file or directory: 'mongo'\n>>> ast.literal_eval(\"shutil.rmtree('mongo')\")\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\n File \"/opt/Python-2.6.1/lib/python2.6/ast.py\", line 68, in literal_eval\n return _convert(node_or_string)\n File \"/opt/Python-2.6.1/lib/python2.6/ast.py\", line 67, in _convert\n raise ValueError('malformed string')\nValueError: malformed string\n\n"} {"input": "open() in Python does not create a file if it doesn't exist What is the best way to open a file as read/write if it exists, or if it does not, then create it and open it as read/write? From what I read, file = open('myfile.dat', 'rw') should do this, right?\nIt is not working for me (Python 2.6.2) and I'm wondering if it is a version problem, or not supposed to work like that or what.\nThe bottom line is, I just need a solution for the problem. I am curious about the other stuff, but all I need is a nice way to do the opening part.\nUPDATE: the enclosing directory was writeable by user and group, not other (I'm on a Linux system... so permissions 775 in other words), and the exact error was:\n\nIOError: no such file or directory.\n\n", "output": "You should use file = open('myfile.dat', 'w+')\n"} {"input": "Non-blocking read on a subprocess.PIPE in python I'm using the subprocess module to start a subprocess and connect to its output stream (stdout). I want to be able to execute non-blocking reads on its stdout. 
Is there a way to make .readline non-blocking or to check if there is data on the stream before I invoke .readline? I'd like this to be portable or at least work under Windows and Linux.\nHere is how I do it for now (it's blocking on the .readline if no data is available):\np = subprocess.Popen('myprogram.exe', stdout = subprocess.PIPE)\noutput_str = p.stdout.readline()\n\n", "output": "fcntl, select, asyncproc won't help in this case.\nA reliable way to read a stream without blocking regardless of operating system is to use Queue.get_nowait():\nimport sys\nfrom subprocess import PIPE, Popen\nfrom threading import Thread\n\ntry:\n from Queue import Queue, Empty\nexcept ImportError:\n from queue import Queue, Empty # python 3.x\n\nON_POSIX = 'posix' in sys.builtin_module_names\n\ndef enqueue_output(out, queue):\n for line in iter(out.readline, b''):\n queue.put(line)\n out.close()\n\np = Popen(['myprogram.exe'], stdout=PIPE, bufsize=1, close_fds=ON_POSIX)\nq = Queue()\nt = Thread(target=enqueue_output, args=(p.stdout, q))\nt.daemon = True # thread dies with the program\nt.start()\n\n# ... do other things here\n\n# read line without blocking\ntry: line = q.get_nowait() # or q.get(timeout=.1)\nexcept Empty:\n print('no output yet')\nelse: # got line\n # ... do something with line\n\n"} {"input": "What is the standard Python docstring format? I have seen a few different styles of writing docstrings in Python, is there an official or \"agreed-upon\" style?\n", "output": "The Google style guide contains an excellent Python style guide. It includes conventions for readable docstring syntax that offers better guidance than PEP-257. 
For example:\ndef square_root(n):\n \"\"\"Calculate the square root of a number.\n\n Args:\n n: the number to get the square root of.\n Returns:\n the square root of n.\n Raises:\n TypeError: if n is not a number.\n ValueError: if n is negative.\n\n \"\"\"\n pass\n\nI like to extend this to also include type information in the arguments, as described in this Sphinx documentation tutorial. For example:\ndef add_value(self, value):\n \"\"\"Add a new value.\n\n Args:\n value (str): the value to add.\n \"\"\"\n pass\n\n"} {"input": "What is the naming convention in Python for variable and function names? Coming from a C# background the naming convention for variables and method names are usually either CamelCase or Pascal Case:\n// C# example\nstring thisIsMyVariable = \"a\"\npublic void ThisIsMyMethod()\n\nIn Python, I have seen the above but I have also seen underscores being used:\n# python example\nthis_is_my_variable = 'a'\ndef this_is_my_function():\n\nIs there a more preferable, definitive coding style for Python?\n", "output": "See Python PEP 8.\n\nFunction names should be lowercase,\n with words separated by underscores as\n necessary to improve readability.\nmixedCase is allowed only in contexts\n where that's already the prevailing\n style\n\nVariables...\n\nUse the function naming rules:\n lowercase with words separated by\n underscores as necessary to improve\n readability.\n\nPersonally, I deviate from this because I also prefer mixedCase over lower_case for my own projects.\n"} {"input": "Remove empty strings from a list of strings I want to remove all empty strings from a list of strings in python.\nMy idea looks like this:\nwhile '' in str_list:\n str_list.remove('')\n\nIs there any more pythonic way to do this?\n", "output": "I would use filter:\nstr_list = filter(None, str_list) # fastest\nstr_list = filter(bool, str_list) # fastest\nstr_list = filter(len, str_list) # a bit of slower\nstr_list = filter(lambda item: item, str_list) # slower than list 
comprehension\n\nPython 3 returns an iterator from filter, so should be wrapped in a call to list()\nstr_list = list(filter(None, str_list)) # fastest\n\n(etc.)\nTests:\n>>> timeit('filter(None, str_list)', 'str_list=[\"a\"]*1000', number=100000)\n2.4797441959381104\n>>> timeit('filter(bool, str_list)', 'str_list=[\"a\"]*1000', number=100000)\n2.4788150787353516\n>>> timeit('filter(len, str_list)', 'str_list=[\"a\"]*1000', number=100000)\n5.2126238346099854\n>>> timeit('[x for x in str_list if x]', 'str_list=[\"a\"]*1000', number=100000)\n13.354584932327271\n>>> timeit('filter(lambda item: item, str_list)', 'str_list=[\"a\"]*1000', number=100000)\n17.427681922912598\n\n"} {"input": "Convert Python dict to object? I'm searching for an elegant way to convert a normal Python dict with some nested dicts to an object.\nFor example:\n>>> d = {'a': 1, 'b': {'c': 2}, 'd': [\"hi\", {'foo': \"bar\"}]}\n\nShould be accessible in this way:\n>>> x = dict2obj(d)\n>>> x.a\n1\n>>> x.b.c\n2\n>>> x.d[1].foo\nbar\n\nI think, this is not possible without recursion, but what would be a nice way to get an objectstyle for dicts?\nThank you in advance.\n", "output": "Update: In Python 2.6 and onwards, consider whether the namedtuple data structure suits your needs:\n>>> from collections import namedtuple\n>>> MyStruct = namedtuple('MyStruct', 'a b d')\n>>> s = MyStruct(a=1, b={'c': 2}, d=['hi'])\n>>> s\nMyStruct(a=1, b={'c': 2}, d=['hi'])\n>>> s.a\n1\n>>> s.b\n{'c': 2}\n>>> s.c\n>>> s.d\n['hi']\n\nThe alternative (original answer contents) is:\nclass Struct:\n def __init__(self, **entries): \n self.__dict__.update(entries)\n\nThen, you can use:\n>>> args = {'a': 1, 'b': 2}\n>>> s = Struct(**args)\n>>> s\n<__main__.Struct instance at 0x01D6A738>\n>>> s.a\n1\n>>> s.b\n2\n\n"} {"input": "Is there any pythonic way to combine two dicts (adding values for keys that appear in both)? 
For example I have two dicts:\nDict A: {'a':1, 'b':2, 'c':3}\nDict B: {'b':3, 'c':4, 'd':5}\n\nI need a pythonic way of 'combining' two dicts such that the result is :\n{'a':1, 'b':5, 'c':7, 'd':5}\n\nThat is to say: if a key appears in both dicts, add their values, if it appears in only one dict, keep its value.\n", "output": "Use collections.Counter:\n>>> from collections import Counter\n>>> A = Counter({'a':1, 'b':2, 'c':3})\n>>> B = Counter({'b':3, 'c':4, 'd':5})\n>>> A + B\nCounter({'c': 7, 'b': 5, 'd': 5, 'a': 1})\n\nCounters are basically a subclass of dict, so you can still do everything else with them you'd normally do with that type, such as iterate over their keys and values.\n"} {"input": "how to detect whether a python variable is a function? I have a variable, x, and I want to know whether it is pointing to a function or not.\nI had hoped I could do something like:\n>>> isinstance(x, function)\n\nBut that gives me:\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in ?\nNameError: name 'function' is not defined\n\nThe reason I picked that is because \n>>> type(x)\n<type 'function'>\n\n", "output": "If this is for Python 2.x or for Python 3.2+, you can also use callable(). It used to be deprecated, but is now undeprecated, so you can use it again. You can read the discussion here: http://bugs.python.org/issue10518. You can do this with:\ncallable(obj)\n\nIf this is for Python 3.x but before 3.2, check if the object has a __call__ attribute. You can do this with:\nhasattr(obj, '__call__')\n\nThe oft-suggested types.FunctionType approach is not correct because it fails to cover many cases that you would presumably want it to pass, like with builtins:\n>>> isinstance(open, types.FunctionType)\nFalse\n\n>>> callable(open)\nTrue\n\nThe proper way to check properties of duck-typed objects is to ask them if they quack, not to see if they fit in a duck-sized container. 
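A quick sketch of just how broad callable() is: builtins, lambdas, classes, and instances that define __call__ all pass, while plain data does not (the Greeter class here is only an illustrative stand-in):

```python
import types

class Greeter:
    # An instance is callable because the class defines __call__
    def __call__(self):
        return "hi"

# All of these quack like functions as far as callable() is concerned
print(callable(len))        # True  (builtin)
print(callable(lambda: 1))  # True  (lambda)
print(callable(Greeter))    # True  (classes are callable: calling one constructs an instance)
print(callable(Greeter()))  # True  (instance with __call__)
print(callable(42))         # False (plain data)

# types.FunctionType is much narrower: it matches only def/lambda functions
print(isinstance(len, types.FunctionType))        # False
print(isinstance(Greeter(), types.FunctionType))  # False
```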
Don't use types.FunctionType unless you have a very specific idea of what a function is.\n"} {"input": "Build a Basic Python Iterator How would one create an iterative function (or iterator object) in python?\n", "output": "Iterator objects in Python conform to the iterator protocol, which basically means they provide two methods: __iter__() and next(). The __iter__ returns the iterator object and is implicitly called at the start of loops. The next() method returns the next value and is implicitly called at each loop increment. next() raises a StopIteration exception when there are no more values to return, which is implicitly captured by looping constructs to stop iterating.\nHere's a simple example of a counter:\nclass Counter:\n def __init__(self, low, high):\n self.current = low\n self.high = high\n\n def __iter__(self):\n return self\n\n def next(self): # Python 3: def __next__(self)\n if self.current > self.high:\n raise StopIteration\n else:\n self.current += 1\n return self.current - 1\n\n\nfor c in Counter(3, 8):\n print c\n\nThis will print:\n3\n4\n5\n6\n7\n8\n\nThis is easier to write using a generator, as covered in a previous answer:\ndef counter(low, high):\n current = low\n while current <= high:\n yield current\n current += 1\n\nfor c in counter(3, 8):\n print c\n\nThe printed output will be the same. Under the hood, the generator object supports the iterator protocol and does something roughly similar to the class Counter.\nDavid Mertz's article, Iterators and Simple Generators, is a pretty good introduction. \n"} {"input": "How can I tell if a string repeats itself in Python? 
I'm looking for a way to test whether or not a given string repeats itself for the entire string or not.\nExamples:\n[\n '0045662100456621004566210045662100456621', # '00456621'\n '0072992700729927007299270072992700729927', # '00729927'\n '001443001443001443001443001443001443001443', # '001443'\n '037037037037037037037037037037037037037037037', # '037'\n '047619047619047619047619047619047619047619', # '047619'\n '002457002457002457002457002457002457002457', # '002457'\n '001221001221001221001221001221001221001221', # '001221'\n '001230012300123001230012300123001230012300123', # '00123'\n '0013947001394700139470013947001394700139470013947', # '0013947'\n '001001001001001001001001001001001001001001001001001', # '001'\n '001406469760900140646976090014064697609', # '0014064697609'\n]\n\nare strings which repeat themselves, and\n[\n '004608294930875576036866359447',\n '00469483568075117370892018779342723',\n '004739336492890995260663507109',\n '001508295625942684766214177978883861236802413273',\n '007518796992481203',\n '0071942446043165467625899280575539568345323741',\n '0434782608695652173913',\n '0344827586206896551724137931',\n '002481389578163771712158808933',\n '002932551319648093841642228739',\n '0035587188612099644128113879',\n '003484320557491289198606271777',\n '00115074798619102416570771',\n]\n\nare examples of ones that do not.\nThe repeating sections of the strings I'm given can be quite long, and the strings themselves can be 500 or more characters, so looping through each character trying to build a pattern then checking the pattern vs the rest of the string seems awful slow. Multiply that by potentially hundreds of strings and I can't see any intuitive solution.\nI've looked into regexes a bit and they seem good for when you know what you're looking for, or at least the length of the pattern you're looking for. 
Unfortunately, I know neither.\nHow can I tell if a string is repeating itself and if it is, what the shortest repeating subsequence is?\n", "output": "Here's a concise solution which avoids regular expressions and slow in-Python loops:\ndef principal_period(s):\n i = (s+s).find(s, 1, -1)\n return None if i == -1 else s[:i]\n\nSee the Community Wiki answer started by @davidism for benchmark results. In summary,\n\nDavid Zhang's solution is the clear winner, outperforming all others by at least 5x for the large example set.\n\n(That answer's words, not mine.)\nThis is based on the observation that a string is periodic if and only if it is equal to a nontrivial rotation of itself. Kudos to @AleksiTorhamo for realizing that we can then recover the principal period from the index of the first occurrence of s in (s+s)[1:-1], and for informing me of the optional start and end arguments of Python's string.find.\n"} {"input": "Good way to append to a string I want an efficient way to append string to another.\nIs there any good built-in method to use?\n", "output": "If you only have one reference to a string and you concatenate another string to the end, CPython now special cases this and tries to extend the string in place.\nThe end result is that the operation is amortized O(n)\neg\ns = \"\"\nfor i in range(n):\n s+=str(n)\n\nused to be O(n^2), but now it is O(n)\nFrom the source (bytesobject.c)\nvoid\nPyBytes_ConcatAndDel(register PyObject **pv, register PyObject *w)\n{\n PyBytes_Concat(pv, w);\n Py_XDECREF(w);\n}\n\n\n/* The following function breaks the notion that strings are immutable:\n it changes the size of a string. We get away with this only if there\n is only one module referencing the object. You can also think of it\n as creating a new string object and destroying the old one, only\n more efficiently. 
In any case, don't use this if the string may\n already be known to some other part of the code...\n Note that if there's not enough memory to resize the string, the original\n string object at *pv is deallocated, *pv is set to NULL, an \"out of\n memory\" exception is set, and -1 is returned. Else (on success) 0 is\n returned, and the value in *pv may or may not be the same as on input.\n As always, an extra byte is allocated for a trailing \\0 byte (newsize\n does *not* include that), and a trailing \\0 byte is stored.\n*/\n\nint\n_PyBytes_Resize(PyObject **pv, Py_ssize_t newsize)\n{\n register PyObject *v;\n register PyBytesObject *sv;\n v = *pv;\n if (!PyBytes_Check(v) || Py_REFCNT(v) != 1 || newsize < 0) {\n *pv = 0;\n Py_DECREF(v);\n PyErr_BadInternalCall();\n return -1;\n }\n /* XXX UNREF/NEWREF interface should be more symmetrical */\n _Py_DEC_REFTOTAL;\n _Py_ForgetReference(v);\n *pv = (PyObject *)\n PyObject_REALLOC((char *)v, PyBytesObject_SIZE + newsize);\n if (*pv == NULL) {\n PyObject_Del(v);\n PyErr_NoMemory();\n return -1;\n }\n _Py_NewReference(*pv);\n sv = (PyBytesObject *) *pv;\n Py_SIZE(sv) = newsize;\n sv->ob_sval[newsize] = '\\0';\n sv->ob_shash = -1; /* invalidate cached hash value */\n return 0;\n}\n\nIt's easy enough to verify empirically\n\n$ python -m timeit -s\"s=''\" \"for i in xrange(10):s+='a'\"\n1000000 loops, best of 3: 1.85 usec per loop\n$ python -m timeit -s\"s=''\" \"for i in xrange(100):s+='a'\"\n10000 loops, best of 3: 16.8 usec per loop\n$ python -m timeit -s\"s=''\" \"for i in xrange(1000):s+='a'\"\n10000 loops, best of 3: 158 usec per loop\n$ python -m timeit -s\"s=''\" \"for i in xrange(10000):s+='a'\"\n1000 loops, best of 3: 1.71 msec per loop\n$ python -m timeit -s\"s=''\" \"for i in xrange(100000):s+='a'\"\n10 loops, best of 3: 14.6 msec per loop\n$ python -m timeit -s\"s=''\" \"for i in xrange(1000000):s+='a'\"\n10 loops, best of 3: 173 msec per loop\n\nIt's important however to note that this optimisation isn't part 
of the Python spec. It's only in the CPython implementation as far as I know. The same empirical testing on pypy or jython for example might show the older O(n**2) performance \n\n$ pypy -m timeit -s\"s=''\" \"for i in xrange(10):s+='a'\"\n10000 loops, best of 3: 90.8 usec per loop\n$ pypy -m timeit -s\"s=''\" \"for i in xrange(100):s+='a'\"\n1000 loops, best of 3: 896 usec per loop\n$ pypy -m timeit -s\"s=''\" \"for i in xrange(1000):s+='a'\"\n100 loops, best of 3: 9.03 msec per loop\n$ pypy -m timeit -s\"s=''\" \"for i in xrange(10000):s+='a'\"\n10 loops, best of 3: 89.5 msec per loop\n\nSo far so good, but then\n\n$ pypy -m timeit -s\"s=''\" \"for i in xrange(100000):s+='a'\"\n10 loops, best of 3: 12.8 sec per loop\n\nouch even worse than quadratic. So pypy is doing something that works well with short strings, but performs poorly for larger strings.\n"} {"input": "How to print date in a regular format in Python? This is my code:\nimport datetime\ntoday = datetime.date.today()\nprint today\n\nThis prints: 2008-11-22 which is exactly what I want BUT....I have a list I'm appending this to and then suddenly everything goes \"wonky\". Here is the code:\nimport datetime\nmylist = []\ntoday = datetime.date.today()\nmylist.append(today)\nprint mylist\n\nThis prints the following: \n[datetime.date(2008, 11, 22)]\n\nHow on earth can I get just a simple date like \"2008-11-22\"?\n", "output": "The WHY: dates are objects\nIn Python, dates are objects. Therefore, when you manipulate them, you manipulate objects, not strings, not timestamps nor anything.\nAny object in Python has TWO string representations:\n\nThe regular representation that is used by \"print\" can be obtained using the str() function. It is most of the time the most common human readable format and is used to ease display. So str(datetime.datetime(2008, 11, 22, 19, 53, 42)) gives you '2008-11-22 19:53:42'. \nThe alternative representation that is used to represent the object nature (as data). 
It can be obtained using the repr() function and is handy for knowing what kind of data you're manipulating while you are developing or debugging. repr(datetime.datetime(2008, 11, 22, 19, 53, 42)) gives you 'datetime.datetime(2008, 11, 22, 19, 53, 42)'.\n\nWhat happened is that when you printed the date using \"print\", it used str() so you could see a nice date string. But when you printed mylist, you printed a list of objects and Python tried to represent the set of data, using repr().\nThe How: what do you want to do with that?\nWell, when you manipulate dates, keep using the date objects all along the way. They have thousands of useful methods and most of the Python API expects dates to be objects.\nWhen you want to display them, just use str(). In Python, the good practice is to explicitly cast everything. So just when it's time to print, get a string representation of your date using str(date).\nOne last thing. When you tried to print the dates, you printed mylist. If you want to print a date, you must print the date objects, not their container (the list).\nE.g., you want to print all the dates in a list :\nfor date in mylist :\n print str(date)\n\nNote that in that specific case, you can even omit str() because print will use it for you. But it should not become a habit :-)\nPractical case, using your code\nimport datetime\nmylist = []\ntoday = datetime.date.today()\nmylist.append(today)\nprint mylist[0] # print the date object, not the container ;-)\n2008-11-22\n\n# It's better to always use str() because :\n\nprint \"This is a new day : \", mylist[0] # will work\nThis is a new day : 2008-11-22\n\nprint \"This is a new day : \" + mylist[0] # will crash\ncannot concatenate 'str' and 'datetime.date' objects\n\nprint \"This is a new day : \" + str(mylist[0]) \nThis is a new day : 2008-11-22\n\nAdvanced date formatting\nDates have a default representation, but you may want to print them in a specific format. 
In that case, you can get a custom string representation using the strftime() method.\nstrftime() expects a string pattern explaining how you want to format your date.\nE.g.: \nprint today.strftime('We are the %d, %b %Y')\n'We are the 22, Nov 2008'\n\nEach letter after a \"%\" represents a format for something :\n\n%d is the day number\n%m is the month number\n%b is the month abbreviation\n%y is the last two digits of the year\n%Y is the full year\n\netc.\nHave a look at the official documentation, or McCutchen's quick reference, since you can't know them all.\nSince PEP3101, every object can have its own format used automatically by the method format of any string. In the case of the datetime, the format is the same one used in\nstrftime. So you can do the same as above like this:\nprint \"We are the {:%d, %b %Y}\".format(today)\n'We are the 22, Nov 2008'\n\nThe advantage of this form is that you can also convert other objects at the same time.\nLocalization\nDates can automatically adapt to the local language and culture if you use them the right way, but it's a bit complicated. Maybe for another question on SO (Stack Overflow) ;-)\n"} {"input": "Use a Glob() to find files recursively in Python? This is what I have:\nglob(os.path.join('src','*.c'))\n\nbut I want to search the subfolders of src. 
Something like this would work:\nglob(os.path.join('src','*.c'))\nglob(os.path.join('src','*','*.c'))\nglob(os.path.join('src','*','*','*.c'))\nglob(os.path.join('src','*','*','*','*.c'))\n\nBut this is obviously limited and clunky.\n", "output": "Python 3.5+\nStarting with Python version 3.5, the glob module supports the \"**\" directive (which is parsed only if you pass the recursive flag):\nimport glob\n\nfor filename in glob.iglob('src/**/*.c', recursive=True):\n print(filename)\n\nIf you need a list, just use glob.glob instead of glob.iglob.\nPython 2.2 to 3.4\nFor older Python versions, starting with Python 2.2, use os.walk to recursively walk a directory and fnmatch.filter to match against a simple expression:\nimport fnmatch\nimport os\n\nmatches = []\nfor root, dirnames, filenames in os.walk('src'):\n for filename in fnmatch.filter(filenames, '*.c'):\n matches.append(os.path.join(root, filename))\n\nPython 2.1 and earlier\nFor even older Python versions, use glob.glob against each filename instead of fnmatch.filter.\n"} {"input": "Python's use of __new__ and __init__? I'm just trying to streamline one of my classes and have introduced some functionality in the same style as the flyweight design pattern.\nHowever, I'm a bit confused as to why __init__ is always called after __new__. I wasn't expecting this. Can anyone tell me why this is happening and how I implement this functionality otherwise? (apart from putting the implementation into the __new__ which feels quite hacky).\nHere's an example:\nclass A(object):\n _dict = dict()\n\n def __new__(cls):\n \tif 'key' in A._dict:\n \t\tprint \"EXISTS\"\n \t\treturn A._dict['key']\n \telse:\n \t\tprint \"NEW\"\n \t\treturn super(A, cls).__new__(cls)\n\n def __init__(self):\n \tprint \"INIT\"\n \tA._dict['key'] = self\n \tprint \"\"\n\na1 = A()\na2 = A()\na3 = A()\n\nOutputs:\nNEW\nINIT\n\nEXISTS\nINIT\n\nEXISTS\nINIT\n\nWhy?\n", "output": "\nUse __new__ when you need to control\n the creation of a new instance. 
Use\n __init__ when you need to control initialization of a new instance.\n__new__ is the first step of instance creation. It's called first, and is\n responsible for returning a new\n instance of your class. In contrast,\n __init__ doesn't return anything; it's only responsible for initializing the\n instance after it's been created.\nIn general, you shouldn't need to\n override __new__ unless you're\n subclassing an immutable type like\n str, int, unicode or tuple.\n\nFrom: http://mail.python.org/pipermail/tutor/2008-April/061426.html\nYou should consider that what you are trying to do is usually done with a Factory and that's the best way to do it. Using __new__ is not a good clean solution so please consider the usage of a factory. Here you have a good factory example.\n"} {"input": "How do I install a Python package with a .whl file? I'm having trouble installing a Python package (specifically, JPype1 0.5.7) on my Windows machine, and would like to install it with Christoph Gohlke's Window binaries. (Which, to my experience, alleviated much of the fuss for many other package installations.)\nHowever, while Christoph used to provide .exe files in the past, recently it seems he's uploading .whl files instead.\nhttp://www.lfd.uci.edu/~gohlke/pythonlibs/#jpype\nBut how do I install .whl files?\n\nNotes:\n\nI've found documents on wheel, but they don't seem so staightforward in explaining how to install .whl files.\nThis question is a duplicate with this question, which wasn't directly answered.\n\n\n", "output": "I just used the following which was quite simple. First open a console then cd to where you've downloaded your file like some-package.whl and use\npip install some-package.whl\n\nNote: if pip.exe is not recognized, you may find it in the \"Scripts\" directory from where python has been installed. If pip is not installed, this page can help:\nHow do I install pip on Windows?\nNote: for clarification\nIf you copy the *.whl file to your local drive (ex. 
C:\\some-dir\\some-file.whl) use the following command line parameters -- \npip install C:/some-dir/some-file.whl\n\n"} {"input": "What is the Python equivalent of static variables inside a function? What is the idiomatic Python equivalent of this C/C++ code?\nvoid foo()\n{\n static int counter = 0;\n counter++;\n printf(\"counter is %d\\n\", counter);\n}\n\nspecifically, how does one implement the static member at the function level, as opposed to the class level? And does placing the function into a class change anything?\n", "output": "A bit reversed, but this should work:\ndef foo():\n foo.counter += 1\n print \"Counter is %d\" % foo.counter\nfoo.counter = 0\n\nIf you want the counter initialization code at the top instead of the bottom, you can create a decorator:\ndef static_var(varname, value):\n def decorate(func):\n setattr(func, varname, value)\n return func\n return decorate\n\nThen use the code like this:\n@static_var(\"counter\", 0)\ndef foo():\n foo.counter += 1\n print \"Counter is %d\" % foo.counter\n\nIt'll still require you to use the foo. prefix, unfortunately.\n\nEDIT (thanks to ony): This looks even nicer:\ndef static_vars(**kwargs):\n def decorate(func):\n for k in kwargs:\n setattr(func, k, kwargs[k])\n return func\n return decorate\n\n@static_vars(counter=0)\ndef foo():\n foo.counter += 1\n print \"Counter is %d\" % foo.counter\n\n"} {"input": "How to make a class JSON serializable How to make a Python class serializable? \nA simple class:\nclass FileItem:\n def __init__(self, fname):\n self.fname = fname\n\nWhat should I do to be able to get output of:\njson.dumps()\n\nWithout an error (FileItem instance at ... 
is not JSON serializable)\n", "output": "Here is a simple solution for a simple feature:\n.toJSON() Method\nInstead of a JSON serializable class, implement a serializer method:\nimport json\n\nclass Object:\n def toJSON(self):\n return json.dumps(self, default=lambda o: o.__dict__, \n sort_keys=True, indent=4)\n\nSo you just call it to serialize:\nme = Object()\nme.name = \"Onur\"\nme.age = 35\nme.dog = Object()\nme.dog.name = \"Apollo\"\n\nprint(me.toJSON())\n\nwill output:\n{\n \"age\": 35,\n \"dog\": {\n \"name\": \"Apollo\"\n },\n \"name\": \"Onur\"\n}\n\n"} {"input": "How to parse an ISO 8601-formatted date in Python? I need to parse RFC 3339 strings like \"2008-09-03T20:56:35.450686Z\" into Python's datetime type.\nI have found strptime in the Python standard library, but it is not very convenient.\nWhat is the best way to do this?\n", "output": "The python-dateutil package can parse not only RFC 3339 datetime strings like the one in the question, but also other ISO 8601 date and time strings that don't comply with RFC 3339 (such as ones with no UTC offset, or ones that represent only a date).\n>>> import dateutil.parser\n>>> dateutil.parser.parse('2008-09-03T20:56:35.450686Z') # RFC 3339 format\ndatetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())\n>>> dateutil.parser.parse('2008-09-03T20:56:35.450686') # ISO 8601 extended format\ndatetime.datetime(2008, 9, 3, 20, 56, 35, 450686)\n>>> dateutil.parser.parse('20080903T205635.450686') # ISO 8601 basic format\ndatetime.datetime(2008, 9, 3, 20, 56, 35, 450686)\n>>> dateutil.parser.parse('20080903') # ISO 8601 basic format, date only\ndatetime.datetime(2008, 9, 3, 0, 0)\n\n"} {"input": "How do you get the logical xor of two variables in Python? How do you get the logical xor of two variables in Python?\nFor example, I have two variables that I expect to be strings. 
I want to test that only one of them contains a True value (is not None or the empty string):\nstr1 = raw_input(\"Enter string one:\")\nstr2 = raw_input(\"Enter string two:\")\nif logical_xor(str1, str2):\n print \"ok\"\nelse:\n print \"bad\"\n\nThe ^ operator seems to be bitwise, and not defined on all objects:\n>>> 1 ^ 1\n0\n>>> 2 ^ 1\n3\n>>> \"abc\" ^ \"\"\nTraceback (most recent call last):\n File \"\", line 1, in \nTypeError: unsupported operand type(s) for ^: 'str' and 'str'\n\n", "output": "If you're already normalizing the inputs to booleans, then != is xor.\nbool(a) != bool(b)\n\n"} {"input": "How to sort a list of objects in Python, based on an attribute of the objects? I've got a list of Python objects that I'd like to sort by an attribute of the objects themselves. The list looks like:\n>>> ut\n[, , , , , , ...]\n\nEach object has a count:\n>>> ut[1].count\n1L\n\nI need to sort the list by number of counts descending.\nI've seen several methods for this, but I'm looking for best practice in Python.\n", "output": "# To sort the list in place...\nut.sort(key=lambda x: x.count, reverse=True)\n\n# To return a new list, use the sorted() built-in function...\nnewlist = sorted(ut, key=lambda x: x.count, reverse=True)\n\nMore on sorting by keys \u00bb \n"} {"input": "How to put the legend out of the plot I have a series of 20 plots (not subplots) to be made in a single figure. I want the legend to be outside of the box. At the same time, I do not want to change the axes, as the size of the figure gets reduced. Kindly help me for the following queries:\n\nI want to keep the legend box outside the plot area. (I want the legend to be outside at the right side of the plot area).\nIs there anyway that I reduce the font size of the text inside the legend box, so that the size of the legend box will be small.\n\n", "output": "There are a number of ways to do what you want. 
To add to what @inalis and @Navi already said, you can use the bbox_to_anchor keyword argument to place the legend partially outside the axes and/or decrease the font size. \nBefore you consider decreasing the font size (which can make things awfully hard to read), try playing around with placing the legend in different places:\nSo, let's start with a generic example:\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nx = np.arange(10)\n\nfig = plt.figure()\nax = plt.subplot(111)\n\nfor i in xrange(5):\n ax.plot(x, i * x, label='$y = %ix$' % i)\n\nax.legend()\n\nplt.show()\n\n\nIf we do the same thing, but use the bbox_to_anchor keyword argument we can shift the legend slightly outside the axes boundaries:\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nx = np.arange(10)\n\nfig = plt.figure()\nax = plt.subplot(111)\n\nfor i in xrange(5):\n ax.plot(x, i * x, label='$y = %ix$' % i)\n\nax.legend(bbox_to_anchor=(1.1, 1.05))\n\nplt.show()\n\n\nSimilarly, you can make the legend more horizontal and/or put it at the top of the figure (I'm also turning on rounded corners and a simple drop shadow):\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nx = np.arange(10)\n\nfig = plt.figure()\nax = plt.subplot(111)\n\nfor i in xrange(5):\n line, = ax.plot(x, i * x, label='$y = %ix$'%i)\n\nax.legend(loc='upper center', bbox_to_anchor=(0.5, 1.05),\n ncol=3, fancybox=True, shadow=True)\nplt.show()\n\n\nAlternatively, you can shrink the current plot's width, and put the legend entirely outside the axis of the figure:\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nx = np.arange(10)\n\nfig = plt.figure()\nax = plt.subplot(111)\n\nfor i in xrange(5):\n ax.plot(x, i * x, label='$y = %ix$'%i)\n\n# Shrink current axis by 20%\nbox = ax.get_position()\nax.set_position([box.x0, box.y0, box.width * 0.8, box.height])\n\n# Put a legend to the right of the current axis\nax.legend(loc='center left', bbox_to_anchor=(1, 0.5))\n\nplt.show()\n\n\nAnd in a similar manner, 
you can shrink the plot vertically, and put a horizontal legend at the bottom:\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nx = np.arange(10)\n\nfig = plt.figure()\nax = plt.subplot(111)\n\nfor i in xrange(5):\n line, = ax.plot(x, i * x, label='$y = %ix$'%i)\n\n# Shrink current axis's height by 10% on the bottom\nbox = ax.get_position()\nax.set_position([box.x0, box.y0 + box.height * 0.1,\n box.width, box.height * 0.9])\n\n# Put a legend below current axis\nax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05),\n fancybox=True, shadow=True, ncol=5)\n\nplt.show()\n\n\nHave a look at the matplotlib legend guide. You might also take a look at plt.figlegend(). Hope that helps a bit, anyway!\n"} {"input": "Traverse a list in reverse order in Python So I can start from len(collection) and end in collection[0].\nEDIT: Sorry, I forgot to mention I also want to be able to access the loop index.\n", "output": "Use the reversed() built-in function:\n>>> a = [\"foo\", \"bar\", \"baz\"]\n>>> for i in reversed(a):\n... print i\n... \nbaz\nbar\nfoo\n\nTo also access the original index:\n>>> for i, e in reversed(list(enumerate(a))):\n... print i, e\n... \n2 baz\n1 bar\n0 foo\n\n"} {"input": "How can I use Python to get the system hostname? I'm writing a chat program for a local network. I would like to be able to identify computers and get the user-set computer name with Python.\n", "output": "Use socket and its gethostname() functionality. This will get the hostname of the computer where the Python interpreter is running:\nimport socket\nprint(socket.gethostname())\n\n"} {"input": "Favorite Django Tips & Features? 
Inspired by the question series 'Hidden features of ...', I am curious to hear about your favorite Django tips or lesser known but useful features you know of.\n\nPlease, include only one tip per answer.\nAdd Django version requirements if there are any.\n\n", "output": "I'm just going to start with a tip from myself :)\nUse os.path.dirname() in settings.py to avoid hardcoded dirnames.\nDon't hardcode path's in your settings.py if you want to run your project in different locations. Use the following code in settings.py if your templates and static files are located within the Django project directory:\n# settings.py\nimport os\nPROJECT_DIR = os.path.dirname(__file__)\n...\nSTATIC_DOC_ROOT = os.path.join(PROJECT_DIR, \"static\")\n...\nTEMPLATE_DIRS = (\n os.path.join(PROJECT_DIR, \"templates\"),\n)\n\nCredits: I got this tip from the screencast 'Django From the Ground Up'.\n"} {"input": "How to do relative imports in Python? Imagine this directory structure:\napp/\n __init__.py\n sub1/\n __init__.py\n mod1.py\n sub2/\n __init__.py\n mod2.py\n\nI'm coding mod1, and I need to import something from mod2. How should I do it? \nI tried from ..sub2 import mod2 but I'm getting an \"Attempted relative import in non-package\".\nI googled around but found only \"sys.path manipulation\" hacks. Isn't there a clean way? \n\nEdit: all my __init__.py's are currently empty \nEdit2: I'm trying to do this because sub2 contains classes that are shared across sub packages (sub1, subX, etc.).\nEdit3: The behaviour I'm looking for is the same as described in PEP 366 (thanks John B)\n", "output": "Everyone seems to want to tell you what you should be doing rather than just answering the question.\nThe problem is that you're running the module as '__main__' by passing the mod1.py as an argument to the interpreter.\nFrom PEP 328:\n\nRelative imports use a module's __name__ attribute to determine that module's position in the package hierarchy. 
If the module's name does not contain any package information (e.g. it is set to '__main__') then relative imports are resolved as if the module were a top level module, regardless of where the module is actually located on the file system.\n\nIn Python 2.6, they're adding the ability to reference modules relative to the main module. PEP 366 describes the change.\nUpdate: According to Nick Coghlan, the recommended alternative is to run the module inside the package using the -m switch.\n"} {"input": "Python try-else What is the intended use of the optional else clause of the try statement?\n", "output": "The statements in the else block are executed if execution falls off the bottom of the try - if there was no exception. Honestly, I've never found a need.\nHowever, Handling Exceptions notes:\n\nThe use of the else clause is better\n than adding additional code to the try\n clause because it avoids accidentally\n catching an exception that wasn't\n raised by the code being protected by\n the try ... except statement.\n\nSo, if you have a method that could, for example, throw an IOError, and you want to catch exceptions it raises, but there's something else you want to do if the first operation succeeds, and you don't want to catch an IOError from that operation, you might write something like this:\n try:\n operation_that_can_throw_ioerror()\n except IOError:\n handle_the_exception_somehow()\n else:\n # we don't want to catch the IOError if it's raised\n another_operation_that_can_throw_ioerror()\n finally:\n something_we_always_need_to_do()\n\nIf you just put another_operation_that_can_throw_ioerror() after operation_that_can_throw_ioerror, the except would catch the second call's errors. And if you put it after the whole try block, it'll always be run, and not until after the finally. 
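A minimal runnable sketch of that ordering (the helper and event names here are illustrative assumptions, not code from the original answer):

```python
# Record the order in which try/except/else/finally run,
# with and without an exception being raised.
def run(raise_error):
    events = []
    try:
        events.append("try")
        if raise_error:
            raise IOError("boom")
    except IOError:
        events.append("except")
    else:
        events.append("else")
    finally:
        events.append("finally")
    return events

print(run(False))  # ['try', 'else', 'finally'] -- else runs before finally
print(run(True))   # ['try', 'except', 'finally'] -- else is skipped
```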
The else lets you make sure\n\nthe second operation's only run if there's no exception,\nit's run before the finally block, and\nany IOErrors it raises aren't caught here\n\n"} {"input": "How to get an absolute file path in Python Given a path such as \"mydir/myfile.txt\", how do I find the absolute filepath relative to the current working directory in Python? E.g. on Windows, I might end up with:\n\"C:/example/cwd/mydir/myfile.txt\"\n\n", "output": ">>> import os\n>>> os.path.abspath(\"mydir/myfile.txt\")\n\n"} {"input": "Proper indentation for Python multiline strings What is the proper indentation for Python multiline strings within a function?\n def method():\n string = \"\"\"line one\nline two\nline three\"\"\"\n\nor\n def method():\n string = \"\"\"line one\n line two\n line three\"\"\"\n\nor something else?\nIt looks kind of weird to have the string hanging outside the function in the first example.\n", "output": "You probably want to line up with the \"\"\"\ndef foo():\n string = \"\"\"line one\n line two\n line three\"\"\"\n\nSince the newlines and spaces are included in the string itself, you will have to postprocess it. If you don't want to do that and you have a whole lot of text, you might want to store it separately in a text file. 
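One more standard-library option worth noting (my addition, not part of the original answer): textwrap.dedent strips the common leading indentation, so the literal can stay indented with the surrounding code.

```python
import textwrap

def method():
    # The backslash after """ suppresses the leading newline;
    # dedent() then removes the common indentation from every line.
    string = textwrap.dedent("""\
        line one
        line two
        line three""")
    return string

print(method())  # three lines with no leading spaces
```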
If a text file does not work well for your application and you don't want to postprocess, I'd probably go with\ndef foo():\n string = (\"this is an \"\n \"implicitly joined \"\n \"string\")\n\nIf you want to postprocess a multiline string to trim out the parts you don't need, you should consider the textwrap module or the technique for postprocessing docstrings presented in PEP 257:\ndef trim(docstring):\n if not docstring:\n return ''\n # Convert tabs to spaces (following the normal Python rules)\n # and split into a list of lines:\n lines = docstring.expandtabs().splitlines()\n # Determine minimum indentation (first line doesn't count):\n indent = sys.maxint\n for line in lines[1:]:\n stripped = line.lstrip()\n if stripped:\n indent = min(indent, len(line) - len(stripped))\n # Remove indentation (first line is special):\n trimmed = [lines[0].strip()]\n if indent < sys.maxint:\n for line in lines[1:]:\n trimmed.append(line[indent:].rstrip())\n # Strip off trailing and leading blank lines:\n while trimmed and not trimmed[-1]:\n trimmed.pop()\n while trimmed and not trimmed[0]:\n trimmed.pop(0)\n # Return a single string:\n return '\\n'.join(trimmed)\n\n"} {"input": "Get Last Day of the Month in Python Is there a way using Python's standard library to easily determine (i.e. one function call) the last day of a given month?\nIf the standard library doesn't support that, does the dateutil package support this?\n", "output": "I didn't notice this earlier when I was looking at the documentation for the calendar module, but a method called monthrange provides this information:\n\nmonthrange(year, month)\n \u00a0\u00a0\u00a0\u00a0Returns weekday of first day of the month and number of days in month, for the specified year and month. 
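Wrapped up as a small helper (the function name below is my own, not from the answer):

```python
import calendar

def last_day_of_month(year, month):
    # monthrange returns (weekday_of_first_day, days_in_month);
    # index [1] is the number of days, i.e. the last day of the month.
    return calendar.monthrange(year, month)[1]

print(last_day_of_month(2002, 1))  # 31
print(last_day_of_month(2008, 2))  # 29 (leap year)
print(last_day_of_month(2100, 2))  # 28 (century year, not a leap year)
```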
\n\n>>> import calendar\n>>> calendar.monthrange(2002,1)\n(1, 31)\n>>> calendar.monthrange(2008,2)\n(4, 29)\n>>> calendar.monthrange(2100,2)\n(0, 28)\n\nso:\ncalendar.monthrange(year, month)[1]\n\nseems like the simplest way to go.\nJust to be clear, monthrange supports leap years as well:\n>>> from calendar import monthrange\n>>> monthrange(2012, 2)\n(2, 29)\n\nMy previous answer still works, but is clearly suboptimal.\n"} {"input": "How to get a function name as a string in Python? In Python, how do I get a function name as a string without calling the function?\ndef my_function():\n pass\n\nprint get_function_name_as_string(my_function) # my_function is not in quotes\n\nshould output \"my_function\".\nIs this available in Python? If not, any idea how to write get_function_name_as_string in Python?\n", "output": "my_function.__name__\n\nUsing __name__ is the preferred method as it applies uniformly. Unlike func_name, it works on built-in functions as well:\n>>> import time\n>>> time.time.func_name\nTraceback (most recent call last):\n File \"\", line 1, in ?\nAttributeError: 'builtin_function_or_method' object has no attribute 'func_name'\n>>> time.time.__name__ \n'time'\n\nAlso the double underscores indicate to the reader this is a special attribute. As a bonus, classes and modules have a __name__ attribute too, so you only have to remember one special name.\n"} {"input": "Usage of __slots__? What is the purpose of __slots__ in Python, especially with respect to when would I want to use it and when not?\n", "output": "Quoting Jacob Hallen:\n\nThe proper use of __slots__ is to save space in objects. Instead of having\n a dynamic dict that allows adding attributes to objects at anytime,\n there is a static structure which does not allow additions after creation.\n This saves the overhead of one dict for every object that uses slots. 
While this is sometimes a useful optimization, it would be completely\n unnecessary if the Python interpreter was dynamic enough so that it would\n only require the dict when there actually were additions to the object.\nUnfortunately there is a side effect to slots. They change the behavior of\n the objects that have slots in a way that can be abused by control freaks\n and static typing weenies. This is bad, because the control freaks should\n be abusing the metaclasses and the static typing weenies should be abusing\n decorators, since in Python, there should be only one obvious way of doing something.\nMaking CPython smart enough to handle saving space without __slots__ is a major\n undertaking, which is probably why it is not on the list of changes for P3k (yet).\n\n"} {"input": "How to change a string into uppercase I have problem in changing a string into uppercase with Python. In my research, I got string.ascii_uppercase but it doesn't work.\nThe following code:\n >>s = 'sdsd'\n >>s.ascii_uppercase\n\nGives this error message:\nTraceback (most recent call last):\n File \"\", line 1, in \nAttributeError: 'str' object has no attribute 'ascii_uppercase'\n\nMy question is: how can I convert a string into uppercase in Python?\n", "output": ">>> s = 'sdsd'\n>>> s.upper()\n'SDSD'\n\nSee String Methods.\n"} {"input": "pg_config executable not found I am having trouble installing psycopg2. 
I get the following error when I try to pip install psycopg2:\nError: pg_config executable not found.\n\nPlease add the directory containing pg_config to the PATH\n\nor specify the full executable path with the option:\n\n\n\n python setup.py build_ext --pg-config /path/to/pg_config build ...\n\n\n\nor with the pg_config option in 'setup.cfg'.\n\n----------------------------------------\nCommand python setup.py egg_info failed with error code 1 in /tmp/pip-build/psycopg2\n\nBut the problem is pg_config is actually in my PATH; it runs without any problem:\n$ which pg_config\n/usr/pgsql-9.1/bin/pg_config\n\nI tried adding the pg_config path to the setup.cfg file and building it using the source files I downloaded from their website (http://initd.org/psycopg/) and I get the following error message!\nError: Unable to find 'pg_config' file in '/usr/pgsql-9.1/bin/'\n\nBut it is actually THERE!!!\nI am baffled by these errors. Can anyone help please?\nBy the way, I sudo all the commands. Also I am on RHEL 5.5.\n", "output": "pg_config is in postgresql-devel (libpq-dev in Debian/Ubuntu, libpq-devel on Cygwin/Babun.)\n"} {"input": "Difference between abstract class and interface in Python What is the difference between abstract class and interface in Python?\n", "output": "What you'll see sometimes is the following:\nclass Abstract1( object ):\n \"\"\"Some description that tells you it's abstract,\n often listing the methods you're expected to supply.\"\"\"\n def aMethod( self ):\n raise NotImplementedError( \"Should have implemented this\" )\n\nBecause Python doesn't have (and doesn't need) a formal Interface contract, the Java-style distinction between abstraction and interface doesn't exist. If someone goes through the effort to define a formal interface, it will also be an abstract class. The only differences would be in the stated intent in the docstring. 
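For completeness, later Python versions also ship the abc module (not mentioned in the answer above), which turns the NotImplementedError convention into an enforced contract; the class names below are illustrative:

```python
import abc

class AbstractReader(abc.ABC):
    """A formal abstract class: instantiation is refused until every
    @abstractmethod is overridden in a subclass."""
    @abc.abstractmethod
    def read(self):
        ...

class FileReader(AbstractReader):
    def read(self):
        return "data"

try:
    AbstractReader()  # raises TypeError: abstract method not implemented
except TypeError as exc:
    print("refused:", exc)

print(FileReader().read())  # data
```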
\nAnd the difference between abstract and interface is a hairsplitting thing when you have duck typing.\nJava uses interfaces because it doesn't have multiple inheritance.\nBecause Python has multiple inheritance, you may also see something like this\nclass SomeAbstraction( object ):\n pass # lots of stuff - but missing something\n\nclass Mixin1( object ):\n def something( self ):\n pass # one implementation\n\nclass Mixin2( object ):\n def something( self ):\n pass # another\n\nclass Concrete1( SomeAbstraction, Mixin1 ):\n pass\n\nclass Concrete2( SomeAbstraction, Mixin2 ):\n pass\n\nThis uses a kind of abstract superclass with mixins to create concrete subclasses that are disjoint.\n"} {"input": "Getting key with maximum value in dictionary? I have a dictionary: keys are strings, values are integers.\nExample: \nstats = {'a':1000, 'b':3000, 'c': 100}\n\nI'd like to get 'b' as an answer, since it's the key with the highest value.\nI did the following, using an intermediate list with reversed key-value tuples:\ninverse = [(value, key) for key, value in stats.items()]\nprint max(inverse)[1]\n\nIs that one the better (or even more elegant) approach?\n", "output": "You're technically trying to index an uninitialized array. You have to first initialize the outer list with lists before adding items; Python calls this\n\"list comprehension\".\n# Creates a list containing 5 lists, each of 8 items, all set to 0\nw, h = 8, 5\nMatrix = [[0 for x in range(w)] for y in range(h)] \n\nYou can now add items to the list:\nMatrix[0][0] = 1\nMatrix[6][0] = 3 # error! range... 
\nMatrix[0][6] = 3 # valid\n\nprint Matrix[0][0] # prints 1\nx, y = 0, 6 \nprint Matrix[x][y] # prints 3; be careful with indexing! \n\nAlthough you can name them as you wish, I look at it this way to avoid some confusion that could arise with the indexing, if you use \"x\" for both the inner and outer lists, and want a non-square Matrix.\n"} {"input": "How to make IPython notebook matplotlib plot inline I am trying to use IPython notebook on MacOS X with Python 2.7.2 and IPython 1.1.0.\nI cannot get matplotlib graphics to show up inline.\nimport matplotlib\nimport numpy as np\nimport matplotlib.pyplot as plt\n%matplotlib inline \n\nI have also tried %pylab inline and the ipython command line arguments --pylab=inline but this makes no difference.\nx = np.linspace(0, 3*np.pi, 500)\nplt.plot(x, np.sin(x**2))\nplt.title('A simple chirp')\nplt.show()\n\nInstead of inline graphics, I get this:\n\n\nAnd matplotlib.get_backend() shows that I have the 'module://IPython.kernel.zmq.pylab.backend_inline' backend.\n", "output": "I used %matplotlib inline in the first cell of the notebook and it works. I think you should try:\n%matplotlib inline\n\nimport matplotlib\nimport numpy as np\nimport matplotlib.pyplot as plt\n\nYou can also always start all your IPython kernels in inline mode by default by setting the following config options in your config files:\nc.IPKernelApp.matplotlib=\n Default: None\n Choices: ['auto', 'gtk', 'gtk3', 'inline', 'nbagg', 'notebook', 'osx', 'qt', 'qt4', 'qt5', 'tk', 'wx']\n Configure matplotlib for interactive use with the default matplotlib backend.\n\n"} {"input": "How to check for NaN in python? float('nan') results in a thingy simply called nan. But how do I check for it? Should be very easy, but i cannot find it.\n", "output": "math.isnan()\n\nChecks if the float x is a NaN (not a number). NaNs are part of the IEEE 754 standards. Operation like but not limited to inf * 0, inf / inf or any operation involving a NaN, e.g. 
nan * 1, return a NaN.\nNew in version 2.6.\n\n>>> import math\n>>> x=float('nan')\n>>> math.isnan(x)\nTrue\n>>>\n\n"} {"input": "Is there a difference between `==` and `is` in Python? My Google-fu has failed me.\nIn Python, are the following two tests for equality equivalent (ha!)?\nn = 5\n# Test one.\nif n == 5:\n print 'Yay!'\n\n# Test two.\nif n is 5:\n print 'Yay!'\n\nDoes this hold true for objects where you would be comparing instances (a list say)?\nOkay, so this kind of answers my question:\nL = []\nL.append(1)\nif L == [1]:\n print 'Yay!'\n# Holds true, but...\n\nif L is [1]:\n print 'Yay!'\n# Doesn't.\n\nSo == tests value where is tests to see if they are the same object?\n", "output": "is will return True if two variables point to the same object, == if the objects referred to by the variables are equal.\n>>> a = [1, 2, 3]\n>>> b = a\n>>> b is a \nTrue\n>>> b == a\nTrue\n>>> b = a[:]\n>>> b is a\nFalse\n>>> b == a\nTrue\n\nIn your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this does not work:\n>>> 1000 is 10**3\nFalse\n>>> 1000 == 10**3\nTrue\n\nThe same holds true for string literals:\n>>> \"a\" is \"a\"\nTrue\n>>> \"aa\" is \"a\" * 2\nTrue\n>>> x = \"a\"\n>>> \"aa\" is x * 2\nFalse\n>>> \"aa\" is intern(x*2)\nTrue\n\nPlease see this question as well.\n"} {"input": "Django - Set Up A Scheduled Job? I've been working on a web app using Django, and I'm curious if there is a way to schedule a job to run periodically. \nBasically I just want to run through the database and make some calculations/updates on an automatic, regular basis, but I can't seem to find any documentation on doing this.\nDoes anyone know how to set this up?\nTo clarify: I know I can set up a cron job to do this, but I'm curious if there is some feature in Django that provides this functionality. 
I'd like people to be able to deploy this app themselves without having to do much config (preferably zero).\nI've considered triggering these actions \"retroactively\" by simply checking if a job should have been run since the last time a request was sent to the site, but I'm hoping for something a bit cleaner.\n", "output": "One solution that I have employed is to do this:\n1) Create a custom management command, e.g. \npython manage.py my_cool_command\n\n2) Use cron (on Linux) or at (on Windows) to run my command at the required times.\nThis is a simple solution that doesn't require installing a heavy AMQP stack. However there are nice advantages to using something like Celery, mentioned in the other answers. In particular, with Celery it is nice to not have to spread your application logic out into crontab files. However the cron solution works quite nicely for a small to medium sized application and where you don't want a lot of external dependencies.\nEDIT:\nIn later version of windows the at command is deprecated for Windows 8, Server 2012 and above. You can use schtasks.exe for same use.\n"} {"input": "if/else in Python's list comprehension? 
How can I do the following in Python?\nrow = [unicode(x.strip()) for x in row if x is not None else '']\n\nEssentially:\n\nreplace all the Nones with empty strings, and then\ncarry out a function.\n\n", "output": "You can totally do that, it's just an ordering issue:\n[ unicode(x.strip()) if x is not None else '' for x in row ]\n\nNote that this actually uses a different language construct, a conditional expression, which itself is not part of the comprehension syntax, while the if after the for...in is part of list comprehensions and used to filter elements from the source iterable.\n"} {"input": "Creating a singleton in Python This question is not for the discussion of whether or not the singleton design pattern is desirable, is an anti-pattern, or for any religious wars, but to discuss how this pattern is best implemented in Python in such a way that is most pythonic. In this instance I define 'most pythonic' to mean that it follows the 'principle of least astonishment'.\nI have multiple classes which would become singletons (my use-case is for a logger, but this is not important). I do not wish to clutter several classes with added gumph when I can simply inherit or decorate.\nBest methods:\n\nMethod 1: A decorator\ndef singleton(class_):\n instances = {}\n def getinstance(*args, **kwargs):\n if class_ not in instances:\n instances[class_] = class_(*args, **kwargs)\n return instances[class_]\n return getinstance\n\n@singleton\nclass MyClass(BaseClass):\n pass\n\nPros\n\nDecorators are additive in a way that is often more intuitive than multiple inheritance.\n\nCons\n\nWhile objects created using MyClass() would be true singleton objects, MyClass itself is a function, not a class, so you cannot call class methods from it. 
Also for m = MyClass(); n = MyClass(); o = type(n)(); then m == n && m != o && n != o\n\n\nMethod 2: A base class\nclass Singleton(object):\n _instance = None\n def __new__(class_, *args, **kwargs):\n if not isinstance(class_._instance, class_):\n class_._instance = object.__new__(class_, *args, **kwargs)\n return class_._instance\n\nclass MyClass(Singleton, BaseClass):\n pass\n\nPros\n\nIt's a true class\n\nCons\n\nMultiple inheritance - eugh! __new__ could be overwritten during inheritance from a second base class? One has to think more than is necessary.\n\n\nMethod 3: A metaclass\nclass Singleton(type):\n _instances = {}\n def __call__(cls, *args, **kwargs):\n if cls not in cls._instances:\n cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)\n return cls._instances[cls]\n\n#Python2\nclass MyClass(BaseClass):\n __metaclass__ = Singleton\n\n#Python3\nclass MyClass(BaseClass, metaclass=Singleton):\n pass\n\nPros\n\nIt's a true class\nAuto-magically covers inheritance\nUses __metaclass__ for its proper purpose (and made me aware of it)\n\nCons\n\nAre there any?\n\n\nMethod 4: decorator returning a class with the same name\ndef singleton(class_):\n class class_w(class_):\n _instance = None\n def __new__(class_, *args, **kwargs):\n if class_w._instance is None:\n class_w._instance = super(class_w,\n class_).__new__(class_,\n *args,\n **kwargs)\n class_w._instance._sealed = False\n return class_w._instance\n def __init__(self, *args, **kwargs):\n if self._sealed:\n return\n super(class_w, self).__init__(*args, **kwargs)\n self._sealed = True\n class_w.__name__ = class_.__name__\n return class_w\n\n@singleton\nclass MyClass(BaseClass):\n pass\n\nPros\n\nIt's a true class\nAuto-magically covers inheritance\n\nCons\n\nIs there not an overhead for creating each new class? Here we are creating two classes for each class we wish to make a singleton. While this is fine in my case, I worry that this might not scale. 
Of course there is a matter of debate as to whether it ought to be too easy to scale this pattern...\nWhat is the point of the _sealed attribute\nCan't call methods of the same name on base classes using super() because they will recurse. This means you can't customize __new__ and can't subclass a class that needs you to call up to __init__.\n", "output": "Use a Metaclass\nI would recommend Method #2, but you're better off using a metaclass than a base class. Here is a sample implementation:\nclass Singleton(type):\n _instances = {}\n def __call__(cls, *args, **kwargs):\n if cls not in cls._instances:\n cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)\n return cls._instances[cls]\n\nclass Logger(object):\n __metaclass__ = Singleton\n\nOr in Python3\nclass Logger(metaclass=Singleton):\n pass\n\nIf you want to run __init__ every time the class is called, add\n else:\n cls._instances[cls].__init__(*args, **kwargs)\n\nto the if statement in Singleton.__call__.\nA few words about metaclasses. A metaclass is the class of a class; that is, a class is an instance of its metaclass. You find the metaclass of an object in Python with type(obj). Normal new-style classes are of type type. Logger in the code above will be of type class 'your_module.Singleton', just as the (only) instance of Logger will be of type class 'your_module.Logger'. When you call logger with Logger(), Python first asks the metaclass of Logger, Singleton, what to do, allowing instance creation to be pre-empted. This process is the same as Python asking a class what to do by calling __getattr__ when you reference one of its attributes by doing myclass.attribute.\nA metaclass essentially decides what the definition of a class means and how to implement that definition. See for example http://code.activestate.com/recipes/498149/, which essentially recreates C-style structs in Python using metaclasses. The thread What are your (concrete) use-cases for metaclasses in Python? 
also provides some examples; they generally seem to be related to declarative programming, especially as used in ORMs.\nIn this situation, if you use your Method #2, and a subclass defines a __new__ method, it will be executed every time you call SubClassOfSingleton() -- because it is responsible for calling the method that returns the stored instance. With a metaclass, it will only be called once, when the only instance is created. You want to customize what it means to call the class, which is decided by its type.\nIn general, it makes sense to use a metaclass to implement a singleton. A singleton is special because it is created only once, and a metaclass is the way you customize the creation of a class. Using a metaclass gives you more control in case you need to customize the singleton class definitions in other ways.\nYour singletons won't need multiple inheritance (because the metaclass is not a base class), but for subclasses of the created class that use multiple inheritance, you need to make sure the singleton class is the first / leftmost one with a metaclass that redefines __call__. This is very unlikely to be an issue. The instance dict is not in the instance's namespace so it won't accidentally overwrite it.\nYou will also hear that the singleton pattern violates the \"Single Responsibility Principle\" -- each class should do only one thing. That way you don't have to worry about messing up one thing the code does if you need to change another, because they are separate and encapsulated. The metaclass implementation passes this test. The metaclass is responsible for enforcing the pattern, and the created class and subclasses need not be aware that they are singletons. Method #1 fails this test, as you noted with \"MyClass itself is a function, not a class, so you cannot call class methods from it.\"\nPython 2 and 3 Compatible Version\nWriting something that works in both Python 2 and 3 requires using a slightly more complicated scheme.
Since metaclasses are usually subclasses of type type, it's possible to use one to dynamically create an intermediary base class at run time with it as its metaclass and then use that as the baseclass of the public Singleton base class. It's harder to explain than to do, as illustrated next:\n# works in Python 2 & 3\nclass _Singleton(type):\n \"\"\" A metaclass that creates a Singleton base class when called. \"\"\"\n _instances = {}\n def __call__(cls, *args, **kwargs):\n if cls not in cls._instances:\n cls._instances[cls] = super(_Singleton, cls).__call__(*args, **kwargs)\n return cls._instances[cls]\n\nclass Singleton(_Singleton('SingletonMeta', (object,), {})): pass\n\nclass Logger(Singleton):\n pass\n\nAn ironic aspect of this approach is that it's using subclassing to implement a metaclass. One possible advantage is that, unlike with a pure metaclass, isinstance(inst, Singleton) will return True.\nCorrections\nOn another topic, you've probably already noticed this, but the base class implementation in your original post is wrong. _instances needs to be referenced on the class, you need to use super() or you're recursing, and __new__ is actually a static method that you have to pass the class to, not a class method, as the actual class hasn't been created yet when it is called. All of these things will be true for a metaclass implementation as well.\nclass Singleton(object):\n _instances = {}\n def __new__(class_, *args, **kwargs):\n if class_ not in class_._instances:\n class_._instances[class_] = super(Singleton, class_).__new__(class_, *args, **kwargs)\n return class_._instances[class_]\n\nclass MyClass(Singleton):\n pass\n\nc = MyClass()\n\nDecorator Returning A Class\nI originally was writing a comment but it was too long, so I'll add this here. Method #4 is better than the other decorator version, but it's more code than needed for a singleton, and it's not as clear what it does. \nThe main problems stem from the class being it's own base class. 
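(As an aside, the single-instance invariant of the metaclass version above is easy to verify directly. The class names AppConfig and ChildConfig in this sketch are invented for illustration, and Python 3 syntax is used:)

```python
# Sanity check of the metaclass singleton shown above (Python 3 syntax).
# AppConfig and ChildConfig are made-up names for this sketch.

class Singleton(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

class AppConfig(metaclass=Singleton):
    def __init__(self):
        self.settings = {}

class ChildConfig(AppConfig):
    pass

a, b = AppConfig(), AppConfig()
assert a is b                         # repeated calls return the one instance
a.settings["debug"] = True
assert b.settings == {"debug": True}  # same underlying object

# Each class in the hierarchy gets its own single instance,
# which is the "auto-magic" inheritance behaviour mentioned earlier.
assert ChildConfig() is ChildConfig()
assert ChildConfig() is not AppConfig()
```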
First, isn't it weird to have a class be a subclass of a nearly identical class with the same name that exists only in its __class__ attribute? This also means that you can't define any methods that call the method of the same name on their base class with super() because they will recurse. This means your class can't customize __new__, and can't derive from any classes that need __init__ called on them.\nWhen to use the singleton pattern\nYour use case is one of the better examples of wanting to use a singleton. You say in one of the comments \"To me logging has always seemed a natural candidate for Singletons.\" You're absolutely right.\nWhen people say singletons are bad, the most common reason is they are implicit shared state. While global variables and top-level module imports are explicit shared state, other objects that are passed around are generally instantiated. This is a good point, with two exceptions.\nThe first, and one that gets mentioned in various places, is when the singletons are constant. Use of global constants, especially enums, is widely accepted, and considered sane because no matter what, none of the users can mess them up for any other user. This is equally true for a constant singleton.\nThe second exception, which gets mentioned less, is the opposite -- when the singleton is only a data sink, not a data source (directly or indirectly). This is why loggers feel like a \"natural\" use for singletons. As the various users are not changing the loggers in ways other users will care about, there is not really shared state. This negates the primary argument against the singleton pattern, and makes them a reasonable choice because of their ease of use for the task.\nHere is a quote from http://googletesting.blogspot.com/2008/08/root-cause-of-singletons.html:\n\nNow, there is one kind of Singleton which is OK. That is a singleton where all of the reachable objects are immutable.
If all objects are immutable than Singleton has no global state, as everything is constant. But it is so easy to turn this kind of singleton into mutable one, it is very slippery slope. Therefore, I am against these Singletons too, not because they are bad, but because it is very easy for them to go bad. (As a side note Java enumeration are just these kind of singletons. As long as you don't put state into your enumeration you are OK, so please don't.)\nThe other kind of Singletons, which are semi-acceptable are those which don't effect the execution of your code, They have no \"side effects\". Logging is perfect example. It is loaded with Singletons and global state. It is acceptable (as in it will not hurt you) because your application does not behave any different whether or not a given logger is enabled. The information here flows one way: From your application into the logger. Even thought loggers are global state since no information flows from loggers into your application, loggers are acceptable. You should still inject your logger if you want your test to assert that something is getting logged, but in general Loggers are not harmful despite being full of state.\n\n"} {"input": "How to detect a Christmas Tree? Which image processing techniques could be used to implement an application that detects the christmas trees displayed in the following images?\n\n\n\n\n\n\nI'm searching for solutions that are going to work on all these images. Therefore, approaches that require training haar cascade classifiers or template matching are not very interesting.\nI'm looking for something that can be written in any programming language, as long as it uses only Open Source technologies. The solution must be tested with the images that are shared on this question. There are 6 input images and the answer should display the results of processing each of them. 
Finally, for each output image there must be red lines drawn to surround the detected tree.\nHow would you go about programmatically detecting the trees in these images?\n", "output": "EDIT NOTE: I edited this post to (i) process each tree image individually, as requested in the requirements, (ii) to consider both object brightness and shape in order to improve the quality of the result. \n\nBelow is presented an approach that takes into consideration the object brightness and shape. In other words, it searches for objects with a triangle-like shape and with significant brightness. It was implemented in Java, using the Marvin image processing framework.\nThe first step is the color thresholding. The objective here is to focus the analysis on objects with significant brightness. \noutput images:\n\n\n\n\n\n\nsource code:\npublic class ChristmasTree {\n\nprivate MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.fill.boundaryFill\");\nprivate MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.color.thresholding\");\nprivate MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.color.invert\");\nprivate MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.morphological.dilation\");\n\npublic ChristmasTree(){\n MarvinImage tree;\n\n // Iterate each image\n for(int i=1; i<=6; i++){\n tree = MarvinImageIO.loadImage(\"./res/trees/tree\"+i+\".png\");\n\n // 1. Threshold\n threshold.setAttribute(\"threshold\", 200);\n threshold.process(tree.clone(), tree);\n }\n}\npublic static void main(String[] args) {\n new ChristmasTree();\n}\n}\n\nIn the second step, the brightest points in the image are dilated in order to form shapes. The result of this process is the probable shape of the objects with significant brightness.
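The Java/Marvin code above does the real work; purely as an illustration of what "threshold then dilate" means, the same two operations can be sketched in a few lines of plain Python on a toy grayscale grid (the grid values and the small radius are invented for this sketch):

```python
def threshold(img, t=200):
    # Step 1: keep only "bright" pixels (value >= t) as a binary mask.
    return [[1 if p >= t else 0 for p in row] for row in img]

def dilate(mask, r=1):
    # Step 2: naive square-window dilation -- a pixel becomes 1 if any
    # neighbour within distance r is 1, merging nearby bright spots.
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = 1 if any(
                mask[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            ) else 0
    return out

img = [
    [0,   0, 0,   0, 0],
    [0, 255, 0, 250, 0],
    [0,   0, 0,   0, 0],
]
mask = dilate(threshold(img))
# the two isolated bright pixels have merged into one connected blob
assert mask[1] == [1, 1, 1, 1, 1]
```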
Applying flood fill segmentation, disconnected shapes are detected.\noutput images:\n\n\n\n\n\n\nsource code:\npublic class ChristmasTree {\n\nprivate MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.fill.boundaryFill\");\nprivate MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.color.thresholding\");\nprivate MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.color.invert\");\nprivate MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.morphological.dilation\");\n\npublic ChristmasTree(){\n MarvinImage tree;\n\n // Iterate each image\n for(int i=1; i<=6; i++){\n tree = MarvinImageIO.loadImage(\"./res/trees/tree\"+i+\".png\");\n\n // 1. Threshold\n threshold.setAttribute(\"threshold\", 200);\n threshold.process(tree.clone(), tree);\n\n // 2. Dilate\n invert.process(tree.clone(), tree);\n tree = MarvinColorModelConverter.rgbToBinary(tree, 127);\n MarvinImageIO.saveImage(tree, \"./res/trees/new/tree_\"+i+\"threshold.png\");\n dilation.setAttribute(\"matrix\", MarvinMath.getTrueMatrix(50, 50));\n dilation.process(tree.clone(), tree);\n MarvinImageIO.saveImage(tree, \"./res/trees/new/tree_\"+1+\"_dilation.png\");\n tree = MarvinColorModelConverter.binaryToRgb(tree);\n\n // 3. 
Segment shapes\n MarvinImage trees2 = tree.clone();\n fill(tree, trees2);\n MarvinImageIO.saveImage(trees2, \"./res/trees/new/tree_\"+i+\"_fill.png\");\n}\n\nprivate void fill(MarvinImage imageIn, MarvinImage imageOut){\n boolean found;\n int color= 0xFFFF0000;\n\n while(true){\n found=false;\n\n Outerloop:\n for(int y=0; y> 16;\n int green = (color & 0x0000FF00) >> 8;\n int blue = (color & 0x000000FF);\n\n if(red <= green && red <= blue){\n red+=5;\n }\n else if(green <= red && green <= blue){\n green+=5;\n }\n else{\n blue+=5;\n }\n\n return 0xFF000000 + (red << 16) + (green << 8) + blue;\n}\n\npublic static void main(String[] args) {\n new ChristmasTree();\n}\n}\n\nAs shown in the output image, multiple shapes were detected. In this problem, there are just a few bright points in the images. However, this approach was implemented to deal with more complex scenarios. \nIn the next step each shape is analyzed. A simple algorithm detects shapes with a pattern similar to a triangle. The algorithm analyzes the object shape line by line. If the center of mass of each shape line is almost the same (given a threshold) and the mass increases as y increases, the object has a triangle-like shape. The mass of a shape line is the number of pixels in that line that belong to the shape. Imagine you slice the object horizontally and analyze each horizontal segment.
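As an aside, that row-by-row test can be sketched in Python (an illustrative re-expression, not the Marvin/Java implementation; the tolerance value is invented):

```python
def is_triangle_like(mask, center_tol=1.5):
    # Per row of a binary mask: mass = number of shape pixels,
    # center = their mean x coordinate. Triangle-like means the
    # centers stay near one vertical axis and the mass grows downwards.
    stats = []
    for row in mask:
        xs = [x for x, v in enumerate(row) if v]
        if xs:
            stats.append((len(xs), sum(xs) / len(xs)))
    if len(stats) < 2:
        return False
    axis = stats[0][1]
    centered = all(abs(c - axis) <= center_tol for _, c in stats)
    widening = all(stats[i + 1][0] > stats[i][0] for i in range(len(stats) - 1))
    return centered and widening

triangle = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
]
square = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
]
assert is_triangle_like(triangle)       # centered and strictly widening
assert not is_triangle_like(square)     # wide, but not widening
```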
If they are centralized to each other and the length increase from the first segment to last one in a linear pattern, you probably has an object that resembles a triangle.\nsource code:\nprivate int[] detectTrees(MarvinImage image){\n HashSet analysed = new HashSet();\n boolean found;\n while(true){\n found = false;\n for(int y=0; y xe){\n xe = x;\n }\n }\n }\n mass[y][0] = xs;\n mass[y][3] = xe;\n mass[y][4] = mc; \n }\n\n int validLines=0;\n for(int y=0; y 0 &&\n Math.abs(((mass[y][0]+mass[y][6])/2)-xStart) <= 50 &&\n mass[y][7] >= (mass[yStart][8] + (y-yStart)*0.3) &&\n mass[y][9] <= (mass[yStart][10] + (y-yStart)*1.5)\n )\n {\n validLines++;\n }\n }\n\n if(validLines > 100){\n return true;\n }\n return false;\n}\n\nFinally, the position of each shape similar to a triangle and with significant brightness, in this case a Christmas tree, is highlighted in the original image, as shown below.\nfinal output images:\n\n\n\n\n\n\nfinal source code:\npublic class ChristmasTree {\n\nprivate MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.fill.boundaryFill\");\nprivate MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.color.thresholding\");\nprivate MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.color.invert\");\nprivate MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin(\"org.marvinproject.image.morphological.dilation\");\n\npublic ChristmasTree(){\n MarvinImage tree;\n\n // Iterate each image\n for(int i=1; i<=6; i++){\n tree = MarvinImageIO.loadImage(\"./res/trees/tree\"+i+\".png\");\n\n // 1. Threshold\n threshold.setAttribute(\"threshold\", 200);\n threshold.process(tree.clone(), tree);\n\n // 2. 
Dilate\n invert.process(tree.clone(), tree);\n tree = MarvinColorModelConverter.rgbToBinary(tree, 127);\n MarvinImageIO.saveImage(tree, \"./res/trees/new/tree_\"+i+\"threshold.png\");\n dilation.setAttribute(\"matrix\", MarvinMath.getTrueMatrix(50, 50));\n dilation.process(tree.clone(), tree);\n MarvinImageIO.saveImage(tree, \"./res/trees/new/tree_\"+1+\"_dilation.png\");\n tree = MarvinColorModelConverter.binaryToRgb(tree);\n\n // 3. Segment shapes\n MarvinImage trees2 = tree.clone();\n fill(tree, trees2);\n MarvinImageIO.saveImage(trees2, \"./res/trees/new/tree_\"+i+\"_fill.png\");\n\n // 4. Detect tree-like shapes\n int[] rect = detectTrees(trees2);\n\n // 5. Draw the result\n MarvinImage original = MarvinImageIO.loadImage(\"./res/trees/tree\"+i+\".png\");\n drawBoundary(trees2, original, rect);\n MarvinImageIO.saveImage(original, \"./res/trees/new/tree_\"+i+\"_out_2.jpg\");\n }\n}\n\nprivate void drawBoundary(MarvinImage shape, MarvinImage original, int[] rect){\n int yLines[] = new int[6];\n yLines[0] = rect[1];\n yLines[1] = rect[1]+(int)((rect[3]/5));\n yLines[2] = rect[1]+((rect[3]/5)*2);\n yLines[3] = rect[1]+((rect[3]/5)*3);\n yLines[4] = rect[1]+(int)((rect[3]/5)*4);\n yLines[5] = rect[1]+rect[3];\n\n List points = new ArrayList();\n for(int i=0; i= image.getWidth() ? (image.getWidth()-1)-i: x1);\n lx2 = (x2+i >= image.getWidth() ? (image.getWidth()-1)-i: x2);\n ly1 = (y1+i >= image.getHeight() ? (image.getHeight()-1)-i: y1);\n ly2 = (y2+i >= image.getHeight() ? 
(image.getHeight()-1)-i: y2);\n\n image.drawLine(lx1+i, ly1, lx2+i, ly2, Color.red);\n image.drawLine(lx1, ly1+i, lx2, ly2+i, Color.red);\n }\n}\n\nprivate void fillRect(MarvinImage image, int[] rect, int length){\n for(int i=0; i analysed = new HashSet();\n boolean found;\n while(true){\n found = false;\n for(int y=0; y xe){\n xe = x;\n }\n }\n }\n mass[y][0] = xs;\n mass[y][12] = xe;\n mass[y][13] = mc; \n }\n\n int validLines=0;\n for(int y=0; y 0 &&\n Math.abs(((mass[y][0]+mass[y][15])/2)-xStart) <= 50 &&\n mass[y][16] >= (mass[yStart][17] + (y-yStart)*0.3) &&\n mass[y][18] <= (mass[yStart][19] + (y-yStart)*1.5)\n )\n {\n validLines++;\n }\n }\n\n if(validLines > 100){\n return true;\n }\n return false;\n}\n\nprivate int[] getObjectRect(MarvinImage image, int color){\n int x1=-1;\n int x2=-1;\n int y1=-1;\n int y2=-1;\n\n for(int y=0; y x2){\n x2 = x;\n }\n if(y1 == -1 || y < y1){\n y1 = y;\n }\n if(y2 == -1 || y > y2){\n y2 = y;\n }\n }\n }\n }\n\n return new int[]{x1, y1, (x2-x1), (y2-y1)};\n}\n\nprivate int newColor(int color){\n int red = (color & 0x00FF0000) >> 16;\n int green = (color & 0x0000FF00) >> 8;\n int blue = (color & 0x000000FF);\n\n if(red <= green && red <= blue){\n red+=5;\n }\n else if(green <= red && green <= blue){\n green+=30;\n }\n else{\n blue+=30;\n }\n\n return 0xFF000000 + (red << 16) + (green << 8) + blue;\n}\n\npublic static void main(String[] args) {\n new ChristmasTree();\n}\n}\n\nThe advantage of this approach is the fact it will probably work with images containing other luminous objects since it analyzes the object shape.\nMerry Christmas!\n\nEDIT NOTE 2\nThere is a discussion about the similarity of the output images of this solution and some other ones. In fact, they are very similar. But this approach does not just segment objects. It also analyzes the object shapes in some sense. It can handle multiple luminous objects in the same scene. In fact, the Christmas tree does not need to be the brightest one. 
I'm just bringing it up to enrich the discussion. There is a bias in the samples: just by looking for the brightest object, you will find the trees. But do we really want to stop the discussion at this point? How far is the computer from really recognizing an object that resembles a Christmas tree? Let's try to close this gap. \nBelow is presented a result just to elucidate this point:\ninput image\n\noutput\n\n"} {"input": "How to get the filename without the extension from a path in Python? How to get the filename without the extension from a path in Python?\nI found out a method called os.path.basename to get the filename with extension. But even when I import os, I am not able to call it path.basename. Is it possible to call it as directly as basename?\n", "output": "Getting the name of the file without the extension:\nimport os\nprint(os.path.splitext(\"path_to_file\")[0])\n\nAs for your import problem, you solve it this way:\nfrom os.path import basename\n\n# now you can call it directly with basename\nprint(basename(\"/a/b/c.txt\"))\n\n"} {"input": "Python remove all whitespace in a string I want to eliminate all the whitespace from a string, on both ends, and in between words.\nI have this Python code:\ndef my_handle(self):\n sentence = ' hello apple '\n sentence.strip()\n\nBut that only eliminates the whitespace on both sides of the string. How do I remove all whitespace?\n", "output": "If you want to remove leading and trailing spaces, use str.strip(): \nsentence = ' hello apple'\nsentence.strip()\n>>> 'hello apple'\n\nIf you want to remove all spaces, use str.replace():\nsentence = ' hello apple'\nsentence.replace(\" \", \"\")\n>>> 'helloapple'\n\nIf you want to remove duplicated spaces, use str.split():\nsentence = ' hello apple'\n\" \".join(sentence.split())\n>>> 'hello apple'\n\n"} {"input": "How to get time of a python program execution? I have a command line program in Python that takes a while to finish.
I want to know the exact time it takes to finish running.\nI've looked at the timeit module, but it seems it's only for small snippets of code. I want to time the whole program.\n", "output": "The simplest way in python:\nimport time\nstart_time = time.time()\nmain()\nprint(\"--- %s seconds ---\" % (time.time() - start_time))\n\nThis assumes that your program takes at least a tenth of second to run.\nPrints:\n--- 0.764891862869 seconds ---\n\n"} {"input": "Best practice for Python Assert \nIs there a performance or code maintenance issue with using assert as part of the standard code instead of using it just for debugging purposes?\nIs\nassert x >= 0, 'x is less than zero'\n\nbetter or worse than \nif x < 0:\n raise Exception, 'x is less than zero'\n\nAlso, is there any way to set a business rule like if x < 0 raise error that is always checked without the try/except/finally so, if at anytime throughout the code x is less than 0 an error is raised, like if you set assert x < 0 at the start of a function, anywhere within the function where x becomes less then 0 an exception is raised?\n\n", "output": "Asserts should be used to test conditions that should never happen. The purpose is to crash early in the case of a corrupt program state.\nExceptions should be used for errors that can conceivably happen, and you should almost always create your own Exception classes.\n\nFor example, if you're writing a function to read from a configuration file into a dict, improper formatting in the file should raise a ConfigurationSyntaxError, while you can assert that you're not about to return None.\n\nIn your example, if x is a value set via a user interface or from an external source, an exception is best.\nIf x is only set by your own code in the same program, go with an assertion.\n"} {"input": "How to print number with commas as thousands separators? I am trying to print an integer in Python 2.6.1 with commas as thousands separators. 
For example, I want to show the number 1234567 as 1,234,567. How would I go about doing this? I have seen many examples on Google, but I am looking for the simplest practical way.\nIt does not need to be locale-specific to decide between periods and commas. I would prefer something as simple as reasonably possible.\n", "output": "I too, prefer the \"simplest practical way\". For >= 2.7:\n\"{:,}\".format(value)\n\nhttp://docs.python.org/library/string.html#format-specification-mini-language\n"} {"input": "Shuffling a list of objects in python I have a list of objects in Python and I want to shuffle them. I thought I could use the random.shuffle method, but this seems to fail when the list is of objects. Is there a method for shuffling object or another way around this?\nimport random\n\nclass a:\n foo = \"bar\"\n\na1 = a()\na2 = a()\nb = [a1,a2]\n\nprint random.shuffle(b)\n\nThis will fail.\n", "output": "random.shuffle should work. Here's an example, where the objects are lists:\nfrom random import shuffle\nx = [[i] for i in range(10)]\nshuffle(x)\n\n# print x gives [[9], [2], [7], [0], [4], [5], [3], [1], [8], [6]]\n# of course your results will vary\n\nNote that shuffle works in place, and returns None.\n"} {"input": "Finding local IP addresses using Python's stdlib How can I find local IP addresses (i.e. 192.168.x.x or 10.0.x.x) in Python platform independently and using only the standard library?\n", "output": "I just found this but it seems a bit hackish, however they say tried it on *nix and I did on windows and it worked.\nimport socket\ns = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)\ns.connect((\"gmail.com\",80))\nprint(s.getsockname()[0])\ns.close()\n\nThis assumes you have an internet access, and that there is no local proxy.\n"} {"input": "What are the differences between json and simplejson Python modules? I have seen many projects using simplejson module instead of json module from the Standard Library. 
Also, there are many different simplejson modules. Why would use these alternatives, instead of the one in the Standard Library?\n", "output": "json is simplejson, added to the stdlib. But since json was added in 2.6, simplejson has the advantage of working on more Python versions (2.4+). \nsimplejson is also updated more frequently than Python, so if you need (or want) the latest version, it's best to use simplejson itself, if possible.\nA good practice, in my opinion, is to use one or the other as a fallback.\ntry: import simplejson as json\nexcept ImportError: import json\n\n"} {"input": "Importing files from different folder in Python I have the following folder structure.\napplication/app/folder/file.py\nand I want to import some functions from file.py in another Python file which resides in\napplication/app2/some_folder/some_file.py\nI've tried\nfrom application.app.folder.file import func_name\nand some other various attempts but so far I couldn't manage to import properly. How can I do this?\n", "output": "By default, you can't. When importing a file, Python only searches the current directory, the directory that the entry-point script is running from, and sys.path which includes locations such as the package installation directory (it's actually a little more complex than this, but this covers most cases).\nHowever, you can add to the Python path at runtime:\n# some_file.py\nimport sys\nsys.path.insert(0, '/path/to/application/app/folder')\n\nimport file\n\n"} {"input": "How do I write JSON data to a file in Python? I have JSON data stored in the variable data.\nI want to write this to a text file for testing so I don't have to grab the data from the server each time.\nCurrently, I am trying this:\nobj = open('data.txt', 'wb')\nobj.write(data)\nobj.close\n\nAnd am receiving the error: \nTypeError: must be string or buffer, not dict\n\nHow to fix this?\n", "output": "You forgot the actual JSON part - data is a dictionary and not yet JSON-encoded. 
Write it like this:\nimport json\nwith open('data.txt', 'w') as outfile:\n json.dump(data, outfile)\n\nNote: Works on both 3.x and 2.x .\n"} {"input": "How do I capture SIGINT in Python? I'm working on a python script that starts several processes and database connections. Every now and then I want to kill the script with a Ctrl+C signal, and I'd like to do some cleanup.\nIn Perl I'd do this:\n$SIG{'INT'} = 'exit_gracefully';\n\nsub exit_gracefully {\n print \"Caught ^C \\n\";\n exit (0);\n}\n\nHow do I do the analogue of this in Python?\n", "output": "Register your handler with signal.signal like this:\n#!/usr/bin/env python\nimport signal\nimport sys\ndef signal_handler(signal, frame):\n print('You pressed Ctrl+C!')\n sys.exit(0)\nsignal.signal(signal.SIGINT, signal_handler)\nprint('Press Ctrl+C')\nsignal.pause()\n\nCode adapted from here.\nMore documentation on signal can be found here. \n"} {"input": "Running unittest with typical test directory structure The very common directory structure for even a simple Python module seems to be to separate the unit tests into their own test directory:\nnew_project/\n antigravity/\n antigravity.py\n test/\n test_antigravity.py\n setup.py\n etc.\n\nfor example see this Python project howto.\nMy question is simply What's the usual way of actually running the tests? 
I suspect this is obvious to everyone except me, but you can't just run python test_antigravity.py from the test directory as its import antigravity will fail as the module is not on the path.\nI know I could modify PYTHONPATH and other search path related tricks, but I can't believe that's the simplest way - it's fine if you're the developer but not realistic to expect your users to use if they just want to check the tests are passing.\nThe other alternative is just to copy the test file into the other directory, but it seems a bit dumb and misses the point of having them in a separate directory to start with.\nSo, if you had just downloaded the source to my new project how would you run the unit tests? I'd prefer an answer that would let me say to my users: \"To run the unit tests do X.\"\n", "output": "The best solution in my opinion is to use the unittest command line interface which will add the directory to the sys.path so you don't have to (done in the TestLoader class).\nFor example for a directory structure like this:\nnew_project\n├── antigravity.py\n└── test_antigravity.py\n\nYou can just run:\n$ cd new_project\n$ python -m unittest test_antigravity\n\nFor a directory structure like yours:\nnew_project\n├── antigravity\n│   ├── __init__.py # make it a package\n│   └── antigravity.py\n└── test\n    ├── __init__.py # also make test a package\n    └── test_antigravity.py\n\nAnd in the test modules inside the test package, you can import the antigravity package and its modules as
usual:\n# import the package\nimport antigravity\n\n# import the antigravity module\nfrom antigravity import antigravity\n\n# or an object inside the antigravity module\nfrom antigravity.antigravity import my_object\n\nRunning a single test module:\nTo run a single test module, in this case test_antigravity.py:\n$ cd new_project\n$ python -m unittest test.test_antigravity\n\nJust reference the test module the same way you import it.\nRunning a single test case or test method:\nAlso you can run a single TestCase or a single test method:\n$ python -m unittest test.test_antigravity.GravityTestCase\n$ python -m unittest test.test_antigravity.GravityTestCase.test_method\n\nRunning all tests:\nYou can also use test discovery which will discover and run all the tests for you, they must be modules or packages named test*.py (can be changed with the -p, --pattern flag):\n$ cd new_project\n$ python -m unittest discover\n\nThis will run all the test*.py modules inside the test package.\n"} {"input": "Where do the Python unit tests go? If you're writing a library, or an app, where do the unit test files go? \nIt's nice to separate the test files from the main app code, but it's awkward to put them into a \"tests\" subdirectory inside of the app root directory, because it makes it harder to import the modules that you'll be testing. \nIs there a best practice here?\n", "output": "For a file module.py, the unit test should normally be called test_module.py, following Pythonic naming conventions.\nThere are several commonly accepted places to put test_module.py:\n\nIn the same directory as module.py.\nIn ../tests/test_module.py (at the same level as the code directory).\nIn tests/test_module.py (one level under the code directory).\n\nI prefer #1 for its simplicity of finding the tests and importing them. Whatever build system you're using can easily be configured to run files starting with test_. 
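The same discovery mechanism is also available programmatically through unittest.TestLoader, which can be handy in build scripts. A minimal self-contained sketch (the throwaway module name test_example.py is invented for this illustration):

```python
import os
import tempfile
import textwrap
import unittest

# Create a throwaway directory containing one test module, then let
# unittest's discovery find and run it -- the same mechanism that
# "python -m unittest discover" uses under the hood.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "test_example.py"), "w") as f:
    f.write(textwrap.dedent("""\
        import unittest

        class ExampleTest(unittest.TestCase):
            def test_truth(self):
                self.assertTrue(True)
        """))

# discover() adds the start directory to sys.path before importing.
suite = unittest.TestLoader().discover(tmp, pattern="test*.py")
result = unittest.TestResult()
suite.run(result)

assert result.testsRun == 1
assert result.wasSuccessful()
```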
Actually, the default unittest pattern used for test discovery is test*.py.\n"} {"input": "Python Create unix timestamp five minutes in the future I have to create an \"Expires\" value 5 minutes in the future, but I have to supply it in UNIX Timestamp format. I have this so far, but it seems like a hack.\ndef expires():\n '''return a UNIX style timestamp representing 5 minutes from now'''\n epoch = datetime.datetime(1970, 1, 1)\n seconds_in_a_day = 60 * 60 * 24\n five_minutes = datetime.timedelta(seconds=5*60)\n five_minutes_from_now = datetime.datetime.now() + five_minutes\n since_epoch = five_minutes_from_now - epoch\n return since_epoch.days * seconds_in_a_day + since_epoch.seconds\n\nIs there a module or function that does the timestamp conversion for me?\n", "output": "Another way is to use calendar.timegm:\nfuture = datetime.datetime.utcnow() + datetime.timedelta(minutes=5)\nreturn calendar.timegm(future.timetuple())\n\nIt's also more portable than %s flag to strftime (which doesn't work on Windows).\n"} {"input": "How do you test that a Python function throws an exception? 
How does one write a unittest that fails only if a function doesn't throw an expected exception?\n", "output": "Use TestCase.assertRaises (or TestCase.failUnlessRaises) from the unittest module, for example:\nimport mymod\n\nclass MyTestCase(unittest.TestCase):\n def test1(self):\n self.assertRaises(SomeCoolException, mymod.myfunc)\n\n"} {"input": "Disable output buffering Is output buffering enabled by default in Python's interpreter for sys.stdout?\nIf the answer is positive, what are all the ways to disable it?\nSuggestions so far:\n\nUse the -u command line switch\nWrap sys.stdout in an object that flushes after every write\nSet PYTHONUNBUFFERED env var\nsys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)\n\nIs there any other way to set some global flag in sys/sys.stdout programmatically during execution?\n", "output": "From Magnus Lycka answer on a mailing list:\n\nYou can skip buffering for a whole\n python process using \"python -u\" \n (or#!/usr/bin/env python -u etc) or by\n setting the environment variable\n PYTHONUNBUFFERED.\nYou could also replace sys.stdout with\n some other stream like wrapper which\n does a flush after every call.\nclass Unbuffered(object):\n def __init__(self, stream):\n self.stream = stream\n def write(self, data):\n self.stream.write(data)\n self.stream.flush()\n def __getattr__(self, attr):\n return getattr(self.stream, attr)\n\nimport sys\nsys.stdout = Unbuffered(sys.stdout)\nprint 'Hello'\n\n\n"} {"input": "What is setup.py? 
Can anyone please explain, what is setup.py and how can it be configured or used?\n", "output": "setup.py is a python file, which usually tells you that the module/package you are about to install has been packaged and distributed with Distutils, which is the standard for distributing Python Modules.\nThis allows you to easily install Python packages; often it's enough to write:\npython setup.py install\n\nand the module will install itself.\nhttp://docs.python.org/install/index.html\n"} {"input": "How to print the full traceback without halting the program? I'm writing a program that parses 10 websites, locates data files, saves the files, and then parses them to make data that can be readily used in the NumPy library. There are tons of errors this file encounters through bad links, poorly formed XML, missing entries, and other things I've yet to categorize. I initially made this program to handle errors like this:\ntry:\n do_stuff()\nexcept:\n pass\n\nBut now I want to log errors:\ntry:\n do_stuff()\nexcept Exception, err:\n print Exception, err\n\nNote this is printing to a log file for later review. This usually prints very useless data. What I want is to print the exact same lines printed when the error triggers without the try-except intercepting the exception, but I don't want it to halt my program since it is nested in a series of for loops that I would like to see to completion.\n", "output": "traceback.format_exc() or sys.exc_info() will yield more info if that's what you want.\nimport traceback\nimport sys\n\ntry:\n do_stuff()\nexcept Exception:\n print(traceback.format_exc())\n # or\n print(sys.exc_info()[0])\n\n"} {"input": "How can I open multiple files using \"with open\" in Python? I want to change a couple of files at one time, iff I can write to all of them. 
I'm wondering if I somehow can combine the multiple open calls with the with statement:\ntry:\n with open('a', 'w') as a and open('b', 'w') as b:\n do_something()\nexcept IOError as e:\n print 'Operation failed: %s' % e.strerror\n\nIf that's not possible, what would an elegant solution to this problem look like?\n", "output": "As of Python 2.7 (or 3.1 respectively) you can write\nwith open('a', 'w') as a, open('b', 'w') as b:\n do_something()\n\nIn earlier versions of Python, you can sometimes use \ncontextlib.nested() to nest context managers. This won't work as expected for opening multiples files, though -- see the linked documentation for details.\n"} {"input": "Python decimal range() step value Is there a way to step between 0 and 1 by 0.1? \nI thought I could do it like the following, but it failed:\nfor i in range(0, 1, 0.1):\n print i\n\nInstead, it says that the step argument cannot be zero, which I did not expect.\n", "output": "You can also use the NumPy library (which isn't part of standard library but is relatively easy to obtain) which has the arange function:\n>>> import numpy as np\n>>> np.arange(0,1,0.1)\narray([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])\n\nas well as the linspace function which lets you have control over what happens at the endpoint (non-trivial for floating point numbers when things won't always divide into the correct number of \"slices\"):\n>>> np.linspace(0,1,11)\narray([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])\n>>> np.linspace(0,1,10,endpoint=False)\narray([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])\n\n"} {"input": "Best way to strip punctuation from a string in Python It seems like there should be a simpler way than:\nimport string\ns = \"string. With. 
Punctuation?\" # Sample string \nout = s.translate(string.maketrans(\"\",\"\"), string.punctuation)\n\nIs there?\n", "output": "From an efficiency perspective, you're not going to beat translate() - it's performing raw string operations in C with a lookup table - there's not much that will beat that but writing your own C code.\nIf speed isn't a worry, another option though is:\nexclude = set(string.punctuation)\ns = ''.join(ch for ch in s if ch not in exclude)\n\nThis is faster than s.replace with each char, but won't perform as well as non-pure python approaches such as regexes or string.translate, as you can see from the below timings. For this type of problem, doing it at as low a level as possible pays off.\nTiming code:\nimport re, string, timeit\n\ns = \"string. With. Punctuation\"\nexclude = set(string.punctuation)\ntable = string.maketrans(\"\",\"\")\nregex = re.compile('[%s]' % re.escape(string.punctuation))\n\ndef test_set(s):\n return ''.join(ch for ch in s if ch not in exclude)\n\ndef test_re(s): # From Vinko's solution, with fix.\n return regex.sub('', s)\n\ndef test_trans(s):\n return s.translate(table, string.punctuation)\n\ndef test_repl(s): # From S.Lott's solution\n for c in string.punctuation:\n s=s.replace(c,\"\")\n return s\n\nprint \"sets :\",timeit.Timer('f(s)', 'from __main__ import s,test_set as f').timeit(1000000)\nprint \"regex :\",timeit.Timer('f(s)', 'from __main__ import s,test_re as f').timeit(1000000)\nprint \"translate :\",timeit.Timer('f(s)', 'from __main__ import s,test_trans as f').timeit(1000000)\nprint \"replace :\",timeit.Timer('f(s)', 'from __main__ import s,test_repl as f').timeit(1000000)\n\nThis gives the following results:\nsets : 19.8566138744\nregex : 6.86155414581\ntranslate : 2.12455511093\nreplace : 28.4436721802\n\n"} {"input": "Iterating over a string In C++, I could do:\nfor (int i = 0; i < str.length(); ++i)\n std::cout << str[i] << std::endl;\n\nHow do I iterate over a string in Python?\n", "output": "As 
Johannes pointed out, \nfor c in \"string\":\n #do something with c\n\nYou can iterate pretty much anything in python using the for loop construct, \nfor example, open(\"file.txt\") returns a file object (and opens the file), iterating over it iterates over lines in that file\nfor line in open(filename):\n # do something with line\n\nIf that seems like magic, well it kinda is, but the idea behind it is really simple. \nThere's a simple iterator protocol that can be applied to any kind of object to make the for loop work on it.\nSimply implement an iterator that defines a next() method, and implement an __iter__ method on a class to make it iterable. (the __iter__ of course, should return an iterator object, that is, an object that defines next())\nSee official documentation\n"} {"input": "Dynamic module import in Python I'm writing a Python application that takes a command as an argument, for example:\n$ python myapp.py command1\n\nI want the application to be extensible, that is, to be able to add new modules that implement new commands without having to change the main application source. The tree looks something like:\nmyapp/\n __init__.py\n commands/\n __init__.py\n command1.py\n command2.py\n foo.py\n bar.py\n\nSo I want the application to find the available command modules at runtime and execute the appropriate one.\nCurrently this is implemented something like:\ncommand = sys.argv[1]\ntry:\n command_module = __import__(\"myapp.commands.%s\" % command, fromlist=[\"myapp.commands\"])\nexcept ImportError:\n # Display error message\n\ncommand_module.run()\n\nThis works just fine, I'm just wondering if there is possibly a more idiomatic way to accomplish what we are doing with this code.\nNote that I specifically don't want to get in to using eggs or extension points. This is not an open-source project and I don't expect there to be \"plugins\". 
The point is to simplify the main application code and remove the need to modify it each time a new command module is added.\n", "output": "With Python older than 2.7/3.1, that's pretty much how you do it. For newer versions, see importlib.import_module for 2.7+ and for 3.1+.\nYou can use exec if you want to as well.\nNote you can import a list of modules by doing this:\n>>> moduleNames = ['sys', 'os', 're', 'unittest'] \n>>> moduleNames\n['sys', 'os', 're', 'unittest']\n>>> modules = map(__import__, moduleNames)\n\nRipped straight from Dive Into Python.\n"} {"input": "What is a Python egg? I'm new to Python and am just trying to understand how its packages work. Presumably \"eggs\" are some sort of packaging mechanism, but what would be a quick overview of what role they play and maybe some information on why they're useful and how to create them?\n", "output": "Same concept as a .jar file in Java, it is a .zip file with some metadata files renamed .egg, for distributing code as bundles.\nSpecifically: The Internal Structure of Python Eggs\n\nA \"Python egg\" is a logical structure embodying the release of a\n specific version of a Python project, comprising its code, resources,\n and metadata. There are multiple formats that can be used to\n physically encode a Python egg, and others can be developed. However,\n a key principle of Python eggs is that they should be discoverable and\n importable. That is, it should be possible for a Python application to\n easily and efficiently find out what eggs are present on a system, and\n to ensure that the desired eggs' contents are importable.\nThe .egg format is well-suited to distribution and the easy\n uninstallation or upgrades of code, since the project is essentially\n self-contained within a single directory or file, unmingled with any\n other projects' code or resources. 
It also makes it possible to have\n multiple versions of a project simultaneously installed, such that\n individual programs can select the versions they wish to use.\n\nNote: Egg packaging has been superseded by Wheel packaging.\n"} {"input": "How to get full path of current file's directory in Python? I want to get the current file's directory path.\nI tried: \n>>> os.path.abspath(__file__)\n'C:\\\\python27\\\\test.py'\n\nBut how can I retrieve the directory's path?\nFor example:\n'C:\\\\python27\\\\'\n\n", "output": "If you mean the directory of the script being run:\nimport os\nos.path.dirname(os.path.abspath(__file__))\n\nIf you mean the current working directory:\nimport os\nos.getcwd()\n\nNote that before and after file is two underscores, not just one. \n"} {"input": "Multiprocessing vs Threading Python I am trying to understand the advantages of multiprocessing over threading. I know that multiprocessing gets around the Global Interpreter Lock, but what other advantages are there, and can threading not do the same thing?\n", "output": "Here are some pros/cons I came up with.\nMultiprocessing\nPros\n\nSeparate memory space\nCode is usually straightforward\nTakes advantage of multiple CPUs & cores\nAvoids GIL limitations for cPython\nEliminates most needs for synchronization primitives unless you use shared memory (instead, it's more of a communication model for IPC)\nChild processes are interruptible/killable\nPython multiprocessing module includes useful abstractions with an interface much like threading.Thread\nA must with cPython for CPU-bound processing\n\nCons\n\nIPC a little more complicated with more overhead (communication model vs. 
shared memory/objects)\nLarger memory footprint\n\nThreading\nPros\n\nLightweight - low memory footprint\nShared memory - makes access to state from another context easier\nAllows you to easily make responsive UIs\ncPython C extension modules that properly release the GIL will run in parallel\nGreat option for I/O-bound applications\n\nCons\n\ncPython - subject to the GIL\nNot interruptible/killable\nIf not following a command queue/message pump model (using the Queue module), then manual use of synchronization primitives becomes a necessity (decisions are needed for the granularity of locking)\nCode is usually harder to understand and to get right - the potential for race conditions increases dramatically \n\n"} {"input": "What is the use of \"assert\" in Python? I have been reading some source code and in several places I have seen the usage of assert. \nWhat does it mean exactly? What is its usage?\n", "output": "The assert statement exists in almost every programming language. When you do...\nassert condition\n\n... you're telling the program to test that condition, and trigger an error if the condition is false.\nIn Python, it's roughly equivalent to this:\nif not condition:\n raise AssertionError()\n\nTry it in the Python shell:\n>>> assert True\n>>> assert False\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nAssertionError\n\nAssertions can include an optional message, and you can disable them when you're done debugging. See here for the relevant documentation.\n"} {"input": "Does Python have \"private\" variables in classes? I'm coming from the Java world and reading Bruce Eckel's Python 3 Patterns, Recipes and Idioms. \nWhile reading about classes, it goes on to say that in Python there is no need to declare instance variables. You just use them in the constructor, and boom, they are there. 
\nSo for example:\nclass Simple:\n def __init__(self1, str):\n print(\"inside the simple constructor\")\n self1.s = str\n def show(self1):\n print(self1.s)\n def showMsg (self, msg):\n print (msg + ':', self.show())\n\nIf that's true, then any object of class Simple can just change the value of variable s outside of the class. \nFor example: \nif __name__ == \"__main__\":\n x = Simple(\"constructor argument\")\n x.s = \"test15\" # this changes the value\n x.show()\n x.showMsg(\"A message\")\n\nIn Java, we have been taught about public/private/protected variables. Those keywords make sense because at times you want variables in a class to which no one outside the class has access to. \nWhy is that not required in Python?\n", "output": "It's cultural. In Python, you don't write to other classes' instance or class variables. In Java, nothing prevents you from doing the same if you really want to - after all, you can always edit the source of the class itself to achieve the same effect. Python drops that pretence of security and encourages programmers to be responsible. In practice, this works very nicely.\nIf you want to emulate private variables for some reason, you can always use the __ prefix from PEP 8. Python mangles the names of variables like __foo so that they're not easily visible to code outside the class that contains them (although you can get around it if you're determined enough, just like you can get around Java's protections if you work at it).\nBy the same convention, the _ prefix means stay away even if you're not technically prevented from doing so. You don't play around with another class's variables that look like __foo or _bar.\n"} {"input": "What is a \"slug\" in Django? When I read Django code I often see in models what is called a \"slug\". I am not quite sure what this is, but I do know it has something to do with URLs. How and when is this slug-thing supposed to be used? 
\n(I have read its definition in this glossary.)\n", "output": "It's a way of generating a valid URL, generally using data already obtained. For instance, using the title of an article to generate a URL. I'd advise to generate the slug, using a function, given a title (or other piece of data), rather than setting it manually.\nAn example:\n The 46 Year Old Virgin \n A silly comedy movie \n the-46-year-old-virgin \n\nNow let's pretend that we have a Django model such as:\nclass Article(models.Model):\n title = models.CharField(max_length=100)\n content = models.TextField(max_length=1000)\n slug = models.SlugField(max_length=40)\n\nHow would you reference this object with a URL, with a meaningful name? You could use Article.id so the URL would look like this:\nwww.example.com/article/23\n\nOr, you could reference the title like so:\nwww.example.com/article/The 46 Year Old Virgin\n\nProblem is, spaces aren't valid in URLs, they need to be replaced by %20 which is ugly, making it the following:\nwww.example.com/article/The%2046%20Year%20Old%20Virgin\n\nThat's not solving our meaningful URL. Wouldn't this be better:\nwww.example.com/article/the-46-year-old-virgin\n\nThat's a slug. the-46-year-old-virgin. All letters are downcased and spaces are replaced by hyphens -. See the URL of this very webpage for an example!\n"} {"input": "How can I iterate through two lists in parallel in Python? I have two iterables in Python, and I want to go over them in pairs:\nfoo = (1, 2, 3)\nbar = (4, 5, 6)\n\nfor (f, b) in some_iterator(foo, bar):\n print \"f: \", f, \"; b: \", b\n\nIt should result in:\nf: 1; b: 4\nf: 2; b: 5\nf: 3; b: 6\n\nOne way to do it is to iterate over the indices:\nfor i in xrange(len(foo)):\n print \"f: \", foo[i], \"; b: \", b[i]\n\nBut that seems somewhat unpythonic to me. Is there a better way to do it?\n", "output": "for f, b in zip(foo, bar):\n print(f, b)\n\nzip stops when the shorter of foo or bar stops.\nIn Python 2, zip\nreturns a list of tuples. 
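As a quick illustration (a minimal sketch; wrapping the result in list() makes it explicit in both Python 2 and 3):

```python
foo = (1, 2, 3)
bar = (4, 5, 6, 7)

# zip pairs elements positionally and stops at the shorter input,
# so bar's trailing 7 is simply dropped.
pairs = list(zip(foo, bar))
print(pairs)  # [(1, 4), (2, 5), (3, 6)]
```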
This is fine when foo and bar are not massive. If\nthey are both massive then forming zip(foo,bar) is an unnecessarily massive\ntemporary variable, and should be replaced by itertools.izip or\nitertools.izip_longest, which returns an iterator instead of a list.\nimport itertools\nfor f,b in itertools.izip(foo,bar):\n print(f,b)\nfor f,b in itertools.izip_longest(foo,bar):\n print(f,b)\n\nizip stops when either foo or bar is exhausted.\nizip_longest stops when both foo and bar are exhausted.\nWhen the shorter iterator(s) are exhausted, izip_longest yields a tuple with None in the position corresponding to that iterator. You can also set a different fillvalue besides None if you wish. See here for the full story.\nIn Python 3, zip\nreturns an iterator of tuples, like itertools.izip in Python2. To get a list\nof tuples, use list(zip(foo, bar)). And to zip until both iterators are\nexhausted, you would use\nitertools.zip_longest.\n\nNote also that zip and its zip-like brethren can accept an arbitrary number of iterables as arguments. For example,\nfor num, cheese, color in zip([1,2,3], ['manchego', 'stilton', 'brie'], \n ['red', 'blue', 'green']):\n print('{} {} {}'.format(num, color, cheese))\n\nprints\n1 red manchego\n2 blue stilton\n3 green brie\n\n"} {"input": "How do I remove the first Item from a Python list? \nPossible Duplicate:\ngood primer for python slice notation \n\nI have the list [0, 1, 2, 3, 4] I'd like to make it into [1, 2, 3, 4]. How do I go about this? 
\n", "output": "Python List\nlist.pop(index)\n>>> l = [0, 1, 2, 3, 4]\n>>> l.pop(0)\n0\n>>> l\n[1, 2, 3, 4]\n\ndel list[index]\n>>> l = [0, 1, 2, 3, 4]\n>>> del l[0]\n>>> l\n[1, 2, 3, 4]\n\nThese both modify your original list.\nOthers have suggested using slicing:\n\nCopies the list\nCan return a subset\n\nAlso, if you are performing many pop(0), you should look at collections.deque\nfrom collections import deque\n>>> l = deque([0, 1, 2, 3, 4])\n>>> l.popleft()\n0\n>>> l\ndeque([1, 2, 3, 4])\n\n\nProvides higher performance popping from left end of the list\n\n"} {"input": "Python - Intersection of two lists I know how to get an intersection of two flat lists:\nb1 = [1,2,3,4,5,9,11,15]\nb2 = [4,5,6,7,8]\nb3 = [val for val in b1 if val in b2]\n\nor \ndef intersect(a, b):\n return list(set(a) & set(b))\n\nprint intersect(b1, b2)\n\nBut when I have to find intersection for nested lists then my problems starts:\nc1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]\nc2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]\n\nIn the end I would like to receive:\nc3 = [[13,32],[7,13,28],[1,6]]\n\nCan you guys give me a hand with this?\nRelated\n\nFlattening a shallow list in python\n\n", "output": "You don't need to define intersection. It's already a first-class part of set.\n>>> b1 = [1,2,3,4,5,9,11,15]\n>>> b2 = [4,5,6,7,8]\n>>> set(b1).intersection(b2)\nset([4, 5])\n\n"} {"input": "Call a parent class's method from child class in Python? When creating a simple object hierarchy in Python, I'd like to be able to invoke methods of the parent class from a derived class. In Perl and Java, there is a keyword for this (super). In Perl, I might do this:\npackage Foo;\n\nsub frotz {\n return \"Bamf\";\n}\n\npackage Bar;\n@ISA = qw(Foo);\n\nsub frotz {\n my $str = SUPER::frotz();\n return uc($str);\n}\n\nIn python, it appears that I have to name the parent class explicitly from the child.\nIn the example above, I'd have to do something like Foo::frotz(). 
\nThis doesn't seem right, since this behavior makes it hard to make deep hierarchies. If children need to know what class defined an inherited method, then all sorts of information pain is created. \nIs this an actual limitation in python, a gap in my understanding or both?\n", "output": "Yes, but only with new-style classes. Use the super() function:\nclass Foo(Bar):\n def baz(self, arg):\n return super(Foo, self).baz(arg)\n\n"} {"input": "How do I determine the size of an object in Python? In C, we can find the size of an int, char, etc. I want to know how to get size of objects like a string, integer, etc. in Python.\nRelated question: How many bytes per element are there in a Python list (tuple)?\nI am using an XML file which contains size fields that specify the size of value. I must parse this XML and do my coding. When I want to change the value of a particular field, I will check the size field of that value. Here I want to compare whether the new value that I'm going to enter is of the same size as in XML. I need to check the size of new value. In case of a string I can say it's the length. But in case of int, float, etc. 
I am confused.\n", "output": "Just use the sys.getsizeof function defined in the sys module.\n\nsys.getsizeof(object[, default]):\nReturn the size of an object in bytes.\n The object can be any type of object.\n All built-in objects will return\n correct results, but this does not\n have to hold true for third-party\n extensions as it is implementation\n specific.\nThe default argument allows to define\n a value which will be returned if the\n object type does not provide means to\n retrieve the size and would cause a\n TypeError.\ngetsizeof calls the object's\n __sizeof__ method and adds an additional garbage collector overhead\n if the object is managed by the\n garbage collector.\n\nUsage example, in python 3.0:\n>>> import sys\n>>> x = 2\n>>> sys.getsizeof(x)\n14\n>>> sys.getsizeof(sys.getsizeof)\n32\n>>> sys.getsizeof('this')\n38\n>>> sys.getsizeof('this also')\n48\n\nIf you are in python < 2.6 and don't have sys.getsizeof you can use this extensive module instead. Never used it though.\n"} {"input": "How to generate all permutations of a list in Python How do you generate all the permutations of a list in Python, independently of the type of elements in that list?\nFor example:\npermutations([])\n[]\n\npermutations([1])\n[1]\n\npermutations([1, 2])\n[1, 2]\n[2, 1]\n\npermutations([1, 2, 3])\n[1, 2, 3]\n[1, 3, 2]\n[2, 1, 3]\n[2, 3, 1]\n[3, 1, 2]\n[3, 2, 1]\n\nEDIT:\nEliben pointed to a solution that's similar to mine although simpler, so I'm choosing it as the accepted answer, although Python 2.6+ has a builtin solution in the itertools module:\nimport itertools\nitertools.permutations([1, 2, 3])\n\n", "output": "And in Python 2.6 onwards:\nimport itertools\nitertools.permutations([1,2,3])\n\n(returned as a generator. Use list(permutations(l)) to return as a list.)\n"} {"input": "How do I convert datetime to date (in Python)? 
How do I convert a datetime.datetime object (for example, the return value of datetime.datetime.now()) to a datetime.date object in Python?\n", "output": "Use the date() method:\ndatetime.datetime.now().date()\n\n"} {"input": "How to check what OS am I running on in Python? What do I need to look at to see if I'm on Windows, Unix, etc?\n", "output": ">>> import os\n>>> print os.name\nposix\n>>> import platform\n>>> platform.system()\n'Linux'\n>>> platform.release()\n'2.6.22-15-generic'\n\nSee: platform — Access to underlying platform's identifying data\n"} {"input": "Calling C/C++ from python? What would be the quickest way to construct a python binding to a C or C++ library?\n(using windows if this matters)\n", "output": "I like ctypes a lot, swig always tended to give me problems. Also ctypes has the advantage that you don't need to satisfy any compile time dependency on python, and your binding will work on any python that has ctypes, not just the one it was compiled against.\nSuppose you have a simple C++ example class you want to talk to in a file called foo.cpp:\n#include <iostream>\n\nclass Foo{\n public:\n void bar(){\n std::cout << \"Hello\" << std::endl;\n }\n};\n\nSince ctypes can only talk to C functions, you need to provide those, declaring them as extern \"C\"\nextern \"C\" {\n Foo* Foo_new(){ return new Foo(); }\n void Foo_bar(Foo* foo){ foo->bar(); }\n}\n\nNext you have to compile this to a shared library\ng++ -c -fPIC foo.cpp -o foo.o\ng++ -shared -Wl,-soname,libfoo.so -o libfoo.so foo.o\n\nAnd finally you have to write your python wrapper (e.g. in fooWrapper.py)\nfrom ctypes import cdll\nlib = cdll.LoadLibrary('./libfoo.so')\n\nclass Foo(object):\n def __init__(self):\n self.obj = lib.Foo_new()\n\n def bar(self):\n lib.Foo_bar(self.obj)\n\nOnce you have that you can call it like\nf = Foo()\nf.bar() #and you will see \"Hello\" on the screen\n\n"} {"input": "What is the common header format of Python files? 
I came across the following header format for Python source files in a document about Python coding guidelines:\n#!/usr/bin/env python\n\n\"\"\"Foobar.py: Description of what foobar does.\"\"\"\n\n__author__ = \"Barack Obama\"\n__copyright__ = \"Copyright 2009, Planet Earth\"\n\nIs this the standard format of headers in the Python world?\nWhat other fields/information can I put in the header?\nPython gurus share your guidelines for good Python source headers :-)\n", "output": "It's all metadata for the Foobar module.\nThe first one is the docstring of the module, that is already explained in Peter's answer.\n\nHow do I organize my modules (source files)? (Archive)\nThe first line of each file should be #!/usr/bin/env python. This makes it possible to run the file as a script invoking the interpreter implicitly, e.g. in a CGI context.\nNext should be the docstring with a description. If the description is long, the first line should be a short summary that makes sense on its own, separated from the rest by a newline.\nAll code, including import statements, should follow the docstring. Otherwise, the docstring will not be recognized by the interpreter, and you will not have access to it in interactive sessions (i.e. through obj.__doc__) or when generating documentation with automated tools.\nImport built-in modules first, followed by third-party modules, followed by any changes to the path and your own modules. Especially, additions to the path and names of your modules are likely to change rapidly: keeping them in one place makes them easier to find.\nNext should be authorship information. 
This information should follow this format:\n__author__ = \"Rob Knight, Gavin Huttley, and Peter Maxwell\"\n__copyright__ = \"Copyright 2007, The Cogent Project\"\n__credits__ = [\"Rob Knight\", \"Peter Maxwell\", \"Gavin Huttley\",\n \"Matthew Wakefield\"]\n__license__ = \"GPL\"\n__version__ = \"1.0.1\"\n__maintainer__ = \"Rob Knight\"\n__email__ = \"rob@spot.colorado.edu\"\n__status__ = \"Production\"\n\nStatus should typically be one of \"Prototype\", \"Development\", or \"Production\". __maintainer__ should be the person who will fix bugs and make improvements if imported. __credits__ differs from __author__ in that __credits__ includes people who reported bug fixes, made suggestions, etc. but did not actually write the code.\n\nHere you have more information, listing __author__, __authors__, __contact__, __copyright__, __license__, __deprecated__, __date__ and __version__ as recognized metadata.\n"} {"input": "Python style: multiple-line conditions in IFs Sometimes I break long conditions in IFs to several lines. The most obvious way to do this is:\n if (cond1 == 'val1' and cond2 == 'val2' and\n cond3 == 'val3' and cond4 == 'val4'):\n do_something\n\nIsn't very very appealing visually, because the action blends with the conditions. However, it is the natural way using correct Python indentation of 4 spaces.\nFor the moment I'm using:\n if ( cond1 == 'val1' and cond2 == 'val2' and\n cond3 == 'val3' and cond4 == 'val4'):\n do_something\n\nBut this isn't very pretty. :-)\nCan you recommend an alternative way ?\n", "output": "You don't need to use 4 spaces on your second conditional line. 
Maybe use:\nif (cond1 == 'val1' and cond2 == 'val2' and \n cond3 == 'val3' and cond4 == 'val4'):\n do_something\n\nAlso, don't forget the whitespace is more flexible than you might think:\nif ( \n cond1 == 'val1' and cond2 == 'val2' and \n cond3 == 'val3' and cond4 == 'val4'\n ):\n do_something\nif (cond1 == 'val1' and cond2 == 'val2' and \n cond3 == 'val3' and cond4 == 'val4'):\n do_something\n\nBoth of those are fairly ugly though.\nMaybe lose the brackets (the Style Guide discourages this though)?\nif cond1 == 'val1' and cond2 == 'val2' and \\\n cond3 == 'val3' and cond4 == 'val4':\n do_something\n\nThis at least gives you some differentiation.\nOr even:\nif cond1 == 'val1' and cond2 == 'val2' and \\\n cond3 == 'val3' and \\\n cond4 == 'val4':\n do_something\n\nI think I prefer:\nif cond1 == 'val1' and \\\n cond2 == 'val2' and \\\n cond3 == 'val3' and \\\n cond4 == 'val4':\n do_something\n\nHere's the Style Guide, which (since 2010) recommends using brackets.\n"} {"input": "How do I do a case insensitive string comparison in Python? What's the best way to do case insensitive string comparison in Python?\nI would like to encapsulate comparison of a regular strings to a repository string using in a very simple and pythonic way. I also would like to have ability to look up values in a dict hashed by strings using regular python strings. Much obliged for advice.\n", "output": "Assuming ASCII strings:\nstring1 = 'Hello'\nstring2 = 'hello'\n\nif string1.lower() == string2.lower():\n print \"The strings are the same (case insensitive)\"\nelse:\n print \"The strings are not the same (case insensitive)\"\n\n"} {"input": "Python reverse / invert a mapping Given a dictionary like so:\nmy_map = { 'a': 1, 'b':2 }\n\nHow can one invert this map to get:\ninv_map = { 1: 'a', 2: 'b' }\n\nEDITOR NOTE: map changed to my_map to avoid conflicts with the built-in function, map. 
Some comments may be affected below.\n", "output": "For Python 2.7.x\ninv_map = {v: k for k, v in my_map.iteritems()}\n\nFor Python 3+:\ninv_map = {v: k for k, v in my_map.items()}\n\n"} {"input": "Command Line Arguments In Python I am originally a C programmer. I have seen numerous tricks and \"hacks\" to read many different arguments. \nWhat are some of the ways Python programmers can do this?\nRelated\n\nWhat's the best way to grab/parse command line arguments passed to a Python script?\nImplementing a \"[command] [action] [parameter]\" style command-line interfaces?\nHow can I process command line arguments in Python?\nHow do I format positional argument help using Python's optparse?\n\n", "output": "import sys\n\nprint \"\\n\".join(sys.argv)\n\nsys.argv is a list that contains all the arguments passed to the script on the command line.\nBasically,\nimport sys\nprint sys.argv[1:]\n\n"} {"input": "How to capitalize the first letter of each word in a string (Python)? s = 'the brown fox'\n\n...do something here...\ns should be :\n'The Brown Fox'\n\nWhat's the easiest way to do this?\n", "output": "The .title() method of a string (either ASCII or Unicode is fine) does this:\n>>> \"hello world\".title()\n'Hello World'\n>>> u\"hello world\".title()\nu'Hello World'\n\nHowever, look out for strings with embedded apostrophes, as noted in the docs.\n\nThe algorithm uses a simple language-independent definition of a word as groups of consecutive letters. The definition works in many contexts but it means that apostrophes in contractions and possessives form word boundaries, which may not be the desired result:\n>>> \"they're bill's friends from the UK\".title()\n\"They'Re Bill'S Friends From The Uk\"\n\n\n"} {"input": "What is the quickest way to HTTP GET in Python? What is the quickest way to HTTP GET in Python if I know the Content will be a string? 
I am searching the docs for a quick one-liner like:\ncontents = url.get(\"http://example.com/foo/bar\")\n\nBut all I can find using Google are httplib and urllib - and I am unable to find a shortcut in those libraries.\nDoes standard Python 2.5 have a shortcut in some form as above, or should I write a function url_get?\n\nI would prefer not to capture the output of shelling out to wget or curl.\n\n", "output": "Python 2.x:\nimport urllib2\nurllib2.urlopen(\"http://example.com/foo/bar\").read()\n\nPython 3.x:\nimport urllib.request\nurllib.request.urlopen(\"http://example.com/foo/bar\").read()\n\nDocumentation for urllib.request and read.\nHow is that?\n"} {"input": "How to check if type of a variable is string? Is there a way to check if the type of a variable in python is string, like\nisinstance(x,int);\n\nfor integer values?\n", "output": "In Python 2.x, you would do\nisinstance(s, basestring)\n\nbasestring is the abstract superclass of str and unicode. It can be used to test whether an object is an instance of str or unicode.\nEquivalent code:\nisinstance(s, (str, unicode))\n\n\nIn Python 3.x, it would be\nisinstance(s, str)\n\n(basestring is not defined in Python 3.)\n"} {"input": "Why are there no ++ and -- operators in Python? Why are there no ++ and -- operators in Python?\n", "output": "It's not because it doesn't make sense; it makes perfect sense to define \"x++\" as \"x += 1, evaluating to the previous binding of x\".\nIf you want to know the original reason, you'll have to either wade through old Python mailing lists or ask somebody who was there (eg. Guido), but it's easy enough to justify after the fact:\nSimple increment and decrement aren't needed as much as in other languages.
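To make that concrete: += covers the increment use case, and a doubled sign isn't even an operator in Python -- ++x simply parses as two stacked unary plus signs. A small sketch:

```python
x = 5
x += 1      # the idiomatic Python increment
print(x)    # 6

y = 5
print(++y)  # 5 -- ++y parses as +(+y), two unary plus operators; y is unchanged
print(y)    # 5
```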
You don't write things like for(int i = 0; i < 10; ++i) in Python very often; instead you do things like for i in range(0, 10).\nSince it's not needed nearly as often, there's much less reason to give it its own special syntax; when you do need to increment, += is usually just fine.\nIt's not a decision of whether it makes sense, or whether it can be done--it does, and it can. It's a question of whether the benefit is worth adding to the core syntax of the language. Remember, this is four operators--postinc, postdec, preinc, predec, and each of these would need to have its own class overloads; they all need to be specified, and tested; it would add opcodes to the language (implying a larger, and therefore slower, VM engine); every class that supports a logical increment would need to implement them (on top of += and -=).\nThis is all redundant with += and -=, so it would become a net loss.\n"} {"input": "What does functools.wraps do? In a comment on the answer to another question, someone said they weren't sure what functools.wraps was doing. So I'm asking this question so that there will be a record of it on StackOverflow for future reference: what does functools.wraps do, exactly?\n", "output": "When you use a decorator, you're replacing one function with another. In other words, if you have a decorator\ndef logged(func):\n def with_logging(*args, **kwargs):\n print func.__name__ + \" was called\"\n return func(*args, **kwargs)\n return with_logging\n\nthen when you say\n@logged\ndef f(x):\n \"\"\"does some math\"\"\"\n return x + x * x\n\nit's exactly the same as saying\ndef f(x):\n \"\"\"does some math\"\"\"\n return x + x * x\nf = logged(f)\n\nand your function f is replaced with the function with_logging. Unfortunately, this means that if you then say\nprint f.__name__\n\nit will print with_logging because that's the name of your new function. 
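You can verify this name clobbering with a small sketch of the same decorator, written in Python 3 syntax (print as a function; otherwise unchanged):

```python
def logged(func):
    def with_logging(*args, **kwargs):
        print(func.__name__ + " was called")
        return func(*args, **kwargs)
    return with_logging

@logged
def f(x):
    """does some math"""
    return x + x * x

# The wrapper's name has replaced the original function's name.
print(f.__name__)  # prints 'with_logging'
```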
In fact, if you look at the docstring for f, it will be blank because with_logging has no docstring, and so the docstring you wrote won't be there anymore. Also, if you look at the pydoc result for that function, it won't be listed as taking one argument x; instead it'll be listed as taking *args and **kwargs because that's what with_logging takes.\nIf using a decorator always meant losing this information about a function, it would be a serious problem. That's why we have functools.wraps. This takes a function used in a decorator and adds the functionality of copying over the function name, docstring, arguments list, etc. And since wraps is itself a decorator, the following code does the correct thing:\nfrom functools import wraps\ndef logged(func):\n @wraps(func)\n def with_logging(*args, **kwargs):\n print func.__name__ + \" was called\"\n return func(*args, **kwargs)\n return with_logging\n\n@logged\ndef f(x):\n \"\"\"does some math\"\"\"\n return x + x * x\n\nprint f.__name__ # prints 'f'\nprint f.__doc__ # prints 'does some math'\n\n"} {"input": "Should I put #! (shebang) in Python scripts, and what form should it take? Should I put the shebang in my Python scripts? In what form?\n#!/usr/bin/env python \n\nor\n#!/usr/local/bin/python\n\nAre these equally portable? Which form is used most?\nNote: the tornado project uses the shebang. On the other hand the Django project doesn't.\n", "output": "The shebang line in any script determines the script's ability to be executed like a standalone executable without typing python beforehand in the terminal or when double clicking it in a file manager (when configured properly). It isn't necessary but generally put there so when someone sees the file opened in an editor, they immediately know what they're looking at.
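As an aside, you can ask Python which interpreter a #!/usr/bin/env python line would resolve to, since env just searches PATH. A sketch using shutil.which (Python 3.3+; the program name queried here is illustrative):

```python
import shutil
import sys

# shutil.which performs the same PATH search that /usr/bin/env does.
print(shutil.which("python3"))  # a path such as /usr/bin/python3, or None if absent

# The interpreter actually running this script:
print(sys.executable)
```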
However, which shebang line you use IS important;\nCorrect usage is:\n#!/usr/bin/env python\n\n#!/usr/bin/env python Usually defaults to python 2.7.latest, and the following defaults to 3.latest\n#!/usr/bin/env python3\n\nDO NOT Use:\n#!/usr/local/bin/python\n\n\n\"python may be installed at /usr/bin/python or /bin/python in those\n cases, the above #! will fail.\"\n\n--\"#!/usr/bin/env python\" vs \"#!/usr/local/bin/python\"\n"} {"input": "Why are some float < integer comparisons four times slower than others? When comparing floats to integers, some pairs of values take much longer to be evaluated than other values of a similar magnitude.\nFor example:\n>>> import timeit\n>>> timeit.timeit(\"562949953420000.7 < 562949953421000\") # run 1 million times\n0.5387085462592742\n\nBut if the float or integer is made smaller or larger by a certain amount, the comparison runs much more quickly:\n>>> timeit.timeit(\"562949953420000.7 < 562949953422000\") # integer increased by 1000\n0.1481498428446173\n>>> timeit.timeit(\"562949953423001.8 < 562949953421000\") # float increased by 3001.1\n0.1459577925548956\n\nChanging the comparison operator (e.g. using == or > instead) does not affect the times in any noticeable way. \nThis is not solely related to magnitude because picking larger or smaller values can result in faster comparisons, so I suspect it is down to some unfortunate way the bits line up. \nClearly, comparing these values is more than fast enough for most use cases. I am simply curious as to why Python seems to struggle more with some pairs of values than with others.\n", "output": "A comment in the Python source code for float objects acknowledges that:\n\nComparison is pretty much a nightmare\n\nThis is especially true when comparing a float to an integer, because, unlike floats, integers in Python can be arbitrarily large and are always exact. Trying to cast the integer to a float might lose precision and make the comparison inaccurate. 
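That precision loss is easy to demonstrate: a double can represent every integer exactly only up to 2**53, so casting anything larger silently rounds. A quick sketch:

```python
big = 2**53 + 1  # one past the range of exactly representable integers

# The cast rounds to the nearest representable float, losing the +1:
print(float(big) == float(2**53))  # True
print(big == 2**53)                # False: the integers really differ
```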
Trying to cast the float to an integer is not going to work either because any fractional part will be lost.\nTo get around this problem, Python performs a series of checks, returning the result if one of the checks succeeds. It compares the signs of the two values, then whether the integer is \"too big\" to be a float, then compares the exponent of the float to the length of the integer. If all of these checks fail, it is necessary to construct two new Python objects to compare in order to obtain the result.\nWhen comparing a float v to an integer/long w, the worst case is that:\n\nv and w have the same sign (both positive or both negative),\nthe integer w has few enough bits that it can be held in the size_t type (typically 32 or 64 bits),\nthe integer w has at least 49 bits,\nthe exponent of the float v is the same as the number of bits in w.\n\nAnd this is exactly what we have for the values in the question:\n>>> import math\n>>> math.frexp(562949953420000.7) # gives the float's (significand, exponent) pair\n(0.9999999999976706, 49)\n>>> (562949953421000).bit_length()\n49\n\nWe see that 49 is both the exponent of the float and the number of bits in the integer. Both numbers are positive and so the four criteria above are met.\nChoosing one of the values to be larger (or smaller) can change the number of bits of the integer, or the value of the exponent, and so Python is able to determine the result of the comparison without performing the expensive final check.\nThis is specific to the CPython implementation of the language.\n\nThe comparison in more detail\nThe float_richcompare function handles the comparison between two values v and w.\nBelow is a step-by-step description of the checks that the function performs. The comments in the Python source are actually very helpful when trying to understand what the function does, so I've left them in where relevant. 
I've also summarised these checks in a list at the foot of the answer.\nThe main idea is to map the Python objects v and w to two appropriate C doubles, i and j, which can then be easily compared to give the correct result. Both Python 2 and Python 3 use the same ideas to do this (the former just handles int and long types separately).\nThe first thing to do is check that v is definitely a Python float and map it to a C double i. Next the function looks at whether w is also a float and maps it to a C double j. This is the best case scenario for the function as all the other checks can be skipped. The function also checks to see whether v is inf or nan: \nstatic PyObject*\nfloat_richcompare(PyObject *v, PyObject *w, int op)\n{\n double i, j;\n int r = 0;\n assert(PyFloat_Check(v)); \n i = PyFloat_AS_DOUBLE(v); \n\n if (PyFloat_Check(w)) \n j = PyFloat_AS_DOUBLE(w); \n\n else if (!Py_IS_FINITE(i)) {\n if (PyLong_Check(w))\n j = 0.0;\n else\n goto Unimplemented;\n }\n\nNow we know that if w failed these checks, it is not a Python float. Now the function checks if it's a Python integer. If this is the case, the easiest test is to extract the sign of v and the sign of w (return 0 if zero, -1 if negative, 1 if positive). If the signs are different, this is all the information needed to return the result of the comparison:\n else if (PyLong_Check(w)) {\n int vsign = i == 0.0 ? 0 : i < 0.0 ? -1 : 1;\n int wsign = _PyLong_Sign(w);\n size_t nbits;\n int exponent;\n\n if (vsign != wsign) {\n /* Magnitudes are irrelevant -- the signs alone\n * determine the outcome.\n */\n i = (double)vsign;\n j = (double)wsign;\n goto Compare;\n }\n } \n\nIf this check failed, then v and w have the same sign. \nThe next check counts the number of bits in the integer w. 
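In pure Python this count corresponds to int.bit_length(), and math.frexp exposes the float's exponent, so the slow-path criterion can be checked directly for the values from the question. A verification sketch (not the C code itself):

```python
import math

f = 562949953420000.7
slow_w = 562949953421000   # triggers the expensive final check
fast_w = 562949953422000   # decided early

exponent = math.frexp(f)[1]      # the float's exponent
print(exponent)                  # 49
print(slow_w.bit_length())       # 49 -- matches the exponent, so the slow path runs
print(fast_w.bit_length())       # 50 -- differs, so the comparison is decided cheaply
```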
If it has too many bits then it can't possibly be held as a float and so must be larger in magnitude than the float v:\n nbits = _PyLong_NumBits(w);\n if (nbits == (size_t)-1 && PyErr_Occurred()) {\n /* This long is so large that size_t isn't big enough\n * to hold the # of bits. Replace with little doubles\n * that give the same outcome -- w is so large that\n * its magnitude must exceed the magnitude of any\n * finite float.\n */\n PyErr_Clear();\n i = (double)vsign;\n assert(wsign != 0);\n j = wsign * 2.0;\n goto Compare;\n }\n\nOn the other hand, if the integer w has 48 or fewer bits, it can safely be turned into a C double j and compared:\n if (nbits <= 48) {\n j = PyLong_AsDouble(w);\n /* It's impossible that <= 48 bits overflowed. */\n assert(j != -1.0 || ! PyErr_Occurred());\n goto Compare;\n }\n\nFrom this point onwards, we know that w has 49 or more bits. It will be convenient to treat w as a positive integer, so change the sign and the comparison operator as necessary:\n if (vsign < 0) {\n /* \"Multiply both sides\" by -1; this also swaps the\n * comparator.\n */\n i = -i;\n op = _Py_SwappedOp[op];\n }\n\nNow the function looks at the exponent of the float. Recall that a float can be written (ignoring sign) as significand * 2^exponent and that the significand represents a number between 0.5 and 1:\n (void) frexp(i, &exponent);\n if (exponent < 0 || (size_t)exponent < nbits) {\n i = 1.0;\n j = 2.0;\n goto Compare;\n }\n\nThis checks two things. If the exponent is less than 0 then the float is smaller than 1 (and so smaller in magnitude than any integer). Or, if the exponent is less than the number of bits in w then we have that v < |w| since significand * 2^exponent is less than 2^nbits. \nFailing these two checks, the function looks to see whether the exponent is greater than the number of bits in w.
This shows that significand * 2^exponent is greater than 2^nbits and so v > |w|:\n if ((size_t)exponent > nbits) {\n i = 2.0;\n j = 1.0;\n goto Compare;\n }\n\nIf this check did not succeed we know that the exponent of the float v is the same as the number of bits in the integer w.\nThe only way that the two values can be compared now is to construct two new Python integers from v and w. The idea is to discard the fractional part of v, double the integer part, and then add one. w is also doubled and these two new Python objects can be compared to give the correct return value. Using an example with small values, 4.65 < 4 would be determined by the comparison (2*4)+1 == 9 < 8 == (2*4) (returning false).\n {\n double fracpart;\n double intpart;\n PyObject *result = NULL;\n PyObject *one = NULL;\n PyObject *vv = NULL;\n PyObject *ww = w;\n\n // snip\n\n fracpart = modf(i, &intpart); // split i (the double that v mapped to)\n vv = PyLong_FromDouble(intpart);\n\n // snip\n\n if (fracpart != 0.0) {\n /* Shift left, and or a 1 bit into vv\n * to represent the lost fraction.\n */\n PyObject *temp;\n\n one = PyLong_FromLong(1);\n\n temp = PyNumber_Lshift(ww, one); // left-shift doubles an integer\n ww = temp;\n\n temp = PyNumber_Lshift(vv, one);\n vv = temp;\n\n temp = PyNumber_Or(vv, one); // a doubled integer is even, so this adds 1\n vv = temp;\n }\n // snip\n }\n}\n\nFor brevity I've left out the additional error-checking and garbage-tracking Python has to do when it creates these new objects. Needless to say, this adds additional overhead and explains why the values highlighted in the question are significantly slower to compare than others.\n\nHere is a summary of the checks that are performed by the comparison function.\nLet v be a float and cast it as a C double. Now, if w is also a float:\n\nCheck whether w is nan or inf.
If so, handle this special case separately depending on the type of w.\nIf not, compare v and w directly by their representations as C doubles.\n\nIf w is an integer:\n\nExtract the signs of v and w. If they are different then we know v and w are different and which is the greater value.\n(The signs are the same.) Check whether w has too many bits to be a float (more than size_t). If so, w has greater magnitude than v.\nCheck if w has 48 or fewer bits. If so, it can be safely cast to a C double without losing its precision and compared with v.\n(w has more than 48 bits. We will now treat w as a positive integer having changed the compare op as appropriate.) \nConsider the exponent of the float v. If the exponent is negative, then v is less than 1 and therefore less than any positive integer. Else, if the exponent is less than the number of bits in w then it must be less than w.\nIf the exponent of v is greater than the number of bits in w then v is greater than w.\n(The exponent is the same as the number of bits in w.)\nThe final check. Split v into its integer and fractional parts. Double the integer part and add 1 to compensate for the fractional part. Now double the integer w. Compare these two new integers instead to get the result.\n\n"} {"input": "\"is\" operator behaves unexpectedly with integers Why does the following behave unexpectedly in Python?\n>>> a = 256\n>>> b = 256\n>>> a is b\nTrue # This is an expected result\n>>> a = 257\n>>> b = 257\n>>> a is b\nFalse # What happened here? Why is this False?\n>>> 257 is 257\nTrue # Yet the literal numbers compare properly\n\nI am using Python 2.5.2. Trying some different versions of Python, it appears that Python 2.3.3 shows the above behaviour between 99 and 100.\nBased on the above, I can hypothesize that Python is internally implemented such that \"small\" integers are stored in a different way than larger integers and the is operator can tell the difference. Why the leaky abstraction? 
What is a better way of comparing two arbitrary objects to see whether they are the same when I don't know in advance whether they are numbers or not?\n", "output": "Take a look at this:\n>>> a = 256\n>>> b = 256\n>>> id(a)\n9987148\n>>> id(b)\n9987148\n>>> a = 257\n>>> b = 257\n>>> id(a)\n11662816\n>>> id(b)\n11662828\n\nEDIT: Here's what I found in the Python 2 documentation, \"Plain Integer Objects\" (It's the same for Python 3):\n\nThe current implementation keeps an\n array of integer objects for all\n integers between -5 and 256, when you\n create an int in that range you\n actually just get back a reference to\n the existing object. So it should be\n possible to change the value of 1. I\n suspect the behaviour of Python in\n this case is undefined. :-)\n\n"} {"input": "What does Ruby have that Python doesn't, and vice versa? There is a lot of discussions of Python vs Ruby, and I all find them completely unhelpful, because they all turn around why feature X sucks in language Y, or that claim language Y doesn't have X, although in fact it does. I also know exactly why I prefer Python, but that's also subjective, and wouldn't help anybody choosing, as they might not have the same tastes in development as I do.\nIt would therefore be interesting to list the differences, objectively. So no \"Python's lambdas sucks\". Instead explain what Ruby's lambdas can do that Python's can't. No subjectivity. Example code is good!\nDon't have several differences in one answer, please. And vote up the ones you know are correct, and down those you know are incorrect (or are subjective). Also, differences in syntax is not interesting. We know Python does with indentation what Ruby does with brackets and ends, and that @ is called self in Python.\nUPDATE: This is now a community wiki, so we can add the big differences here.\nRuby has a class reference in the class body\nIn Ruby you have a reference to the class (self) already in the class body. 
In Python you don't have a reference to the class until after the class construction is finished.\nAn example:\nclass Kaka\n puts self\nend\n\nself in this case is the class, and this code would print out \"Kaka\". There is no way to print out the class name or in other ways access the class from the class definition body in Python (outside method definitions).\nAll classes are mutable in Ruby\nThis lets you develop extensions to core classes. Here's an example of a rails extension:\nclass String\n def starts_with?(other)\n head = self[0, other.length]\n head == other\n end\nend\n\nPython (imagine there were no ''.startswith method):\ndef starts_with(s, prefix):\n return s[:len(prefix)] == prefix\n\nYou could use it on any sequence (not just strings). In order to use it you should import it explicitly e.g., from some_module import starts_with.\nRuby has Perl-like scripting features\nRuby has first class regexps, $-variables, the awk/perl line by line input loop and other features that make it more suited to writing small shell scripts that munge text files or act as glue code for other programs.\nRuby has first class continuations\nThanks to the callcc statement. In Python you can create continuations by various techniques, but there is no support built in to the language.\nRuby has blocks\nWith the \"do\" statement you can create a multi-line anonymous function in Ruby, which will be passed in as an argument into the method in front of do, and called from there. 
In Python you would instead do this either by passing a method or with generators.\nRuby:\namethod { |here|\n many=lines+of+code\n goes(here)\n}\n\nPython (Ruby blocks correspond to different constructs in Python):\nwith amethod() as here: # `amethod()` is a context manager\n many=lines+of+code\n goes(here)\n\nOr\nfor here in amethod(): # `amethod()` is an iterable\n many=lines+of+code\n goes(here)\n\nOr\ndef function(here):\n many=lines+of+code\n goes(here)\n\namethod(function) # `function` is a callback\n\nInterestingly, the convenience statement in Ruby for calling a block is called \"yield\", which in Python will create a generator.\nRuby:\ndef themethod\n yield 5\nend\n\nthemethod do |foo|\n puts foo\nend\n\nPython:\ndef themethod():\n yield 5\n\nfor foo in themethod():\n print foo\n\nAlthough the principles are different, the result is strikingly similar.\nRuby supports functional style (pipe-like) programming more easily\nmyList.map(&:description).reject(&:empty?).join(\"\\n\")\n\nPython:\ndescriptions = (f.description() for f in mylist)\n\"\\n\".join(filter(len, descriptions))\n\nPython has built-in generators (which are used like Ruby blocks, as noted above)\nPython has support for generators in the language. In Ruby 1.8 you can use the generator module which uses continuations to create a generator from a block. Or, you could just use a block/proc/lambda! Moreover, in Ruby 1.9 Fibers are, and can be used as, generators, and the Enumerator class is a built-in generator\ndocs.python.org has this generator example:\ndef reverse(data):\n for index in range(len(data)-1, -1, -1):\n yield data[index]\n\nContrast this with the above block examples.\nPython has flexible name space handling\nIn Ruby, when you import a file with require, all the things defined in that file will end up in your global namespace. This causes namespace pollution. The solution to that is Ruby's modules.
But if you create a namespace with a module, then you have to use that namespace to access the contained classes.\nIn Python, the file is a module, and you can import its contained names with from themodule import *, thereby polluting the namespace if you want. But you can also import just selected names with from themodule import aname, another or you can simply import themodule and then access the names with themodule.aname. If you want more levels in your namespace you can have packages, which are directories with modules and an __init__.py file.\nPython has docstrings\nDocstrings are strings that are attached to modules, functions and methods and can be\nintrospected at runtime. This helps for creating such things as the help command and\nautomatic documentation.\ndef frobnicate(bar):\n \"\"\"frobnicate takes a bar and frobnicates it\n\n >>> bar = Bar()\n >>> bar.is_frobnicated()\n False\n >>> frobnicate(bar)\n >>> bar.is_frobnicated()\n True\n \"\"\"\n\nRuby's equivalent are similar to javadocs, and located above the method instead of within it. They can be retrieved at runtime from the files by using 1.9's Method#source_location example use\nPython has multiple inheritance\nRuby does not (\"on purpose\" -- see Ruby's website, see here how it's done in Ruby). 
It does reuse the module concept as a type of abstract classes.\nPython has list/dict comprehensions\nPython:\nres = [x*x for x in range(1, 10)]\n\nRuby:\nres = (0..9).map { |x| x * x }\n\nPython:\n>>> (x*x for x in range(10))\n<generator object <genexpr> at 0xb7c1ccd4>\n>>> list(_)\n[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n\nRuby:\np = proc { |x| x * x }\n(0..9).map(&p)\n\nPython 2.7+:\n>>> {x:str(y*y) for x,y in {1:2, 3:4}.items()}\n{1: '4', 3: '16'}\n\nRuby:\n>> Hash[{1=>2, 3=>4}.map{|x,y| [x,(y*y).to_s]}]\n=> {1=>\"4\", 3=>\"16\"}\n\nPython has decorators\nThings similar to decorators can also be created in Ruby, and it can also be argued that they aren't as necessary as in Python.\nSyntax differences\nRuby requires \"end\" or \"}\" to close all of its scopes, while Python uses white-space only. There have been recent attempts in Ruby to allow for whitespace only indentation http://github.com/michaeledgar/seamless\n", "output": "Ruby has the concepts of blocks, which are essentially syntactic sugar around a section of code; they are a way to create closures and pass them to another method which may or may not use the block. A block can be invoked later on through a yield statement.\nFor example, a simple definition of an each method on Array might be something like:\nclass Array\n def each\n for i in self \n yield(i) # If a block has been passed, control will be passed here.\n end \n end \nend\n\nThen you can invoke this like so:\n# Add five to each element.\n[1, 2, 3, 4].each{ |e| puts e + 5 }\n> [6, 7, 8, 9]\n\nPython has anonymous functions/closures/lambdas, but it doesn't quite have blocks since it's missing some of the useful syntactic sugar. However, there's at least one way to get it in an ad-hoc fashion. See, for example, here.\n"} {"input": "super() raises \"TypeError: must be type, not classobj\" for new-style class The following use of super() raises a TypeError: why?\n>>> from HTMLParser import HTMLParser\n>>> class TextParser(HTMLParser):\n... def __init__(self):\n... 
super(TextParser, self).__init__()\n... self.all_data = []\n... \n>>> TextParser()\n(...)\nTypeError: must be type, not classobj\n\nThere is a similar question on StackOverflow: python super() raises TypeError ! Why?, where the error is explained by the fact that the user class is not a new-style class. However, the class above is a new-style class, as it inherits from object:\n>>> isinstance(HTMLParser(), object)\nTrue\n\nWhat am I missing? How can I use super(), here?\nUsing HTMLParser.__init__(self) instead of super(TextParser, self).__init__() would work, but I would like to understand the TypeError.\nPS: Joachim pointed out that being a new-style-class instance is not equivalent to being an object. I read the opposite many times, hence my confusion (example of new-style class instance test based on object instance test: http://stackoverflow.com/revisions/2655651/3).\n", "output": "Alright, it's the usual \"super() cannot be used with an old-style class\".\nHowever, the important point is that the correct test for \"is this a new-style instance (i.e. object)?\" is\n>>> class OldStyle: pass\n>>> instance = OldStyle()\n>>> issubclass(instance.__class__, object)\nFalse\n\nand not (as in the question):\n>>> isinstance(instance, object)\nTrue\n\nFor classes, the correct \"is this a new-style class\" test is:\n>>> issubclass(OldStyle, object) # OldStyle is not a new-style class\nFalse\n>>> issubclass(int, object) # int is a new-style class\nTrue\n\nThe crucial point is that with old-style classes, the class of an instance and its type are distinct. Here, OldStyle().__class__ is OldStyle, which does not inherit from object, while type(OldStyle()) is the instance type, which does inherit from object. Basically, an old-style class just creates objects of type instance (whereas a new-style class creates objects whose type is the class itself). 
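(For completeness: in Python 3 every class is new-style, so this failure mode disappears and the zero-argument form of super() works. A sketch under Python 3, with a stand-in base class since HTMLParser moved to html.parser:)

```python
class Base:
    def __init__(self):
        self.all_data = []

class TextParser(Base):  # stand-in for the HTMLParser subclass in the question
    def __init__(self):
        super().__init__()  # Python 3: no arguments required
        self.all_data.append("ready")

t = TextParser()
print(t.all_data)  # ['ready']
```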
This is probably why the instance OldStyle() is an object: its type() inherits from object (the fact that its class does not inherit from object does not count: old-style classes merely construct new objects of type instance). Partial reference: http://stackoverflow.com/a/9699961/42973.\nPS: The difference between a new-style class and an old-style one can also be seen with:\n>>> type(OldStyle) # OldStyle creates objects but is not itself a type\nclassobj\n>>> isinstance(OldStyle, type)\nFalse\n>>> type(int) # A new-style class is a type\ntype\n\n(old-style classes are not types, so they cannot be the type of their instances).\n"} {"input": "Get difference between two lists I have two lists in Python, like these:\ntemp1 = ['One', 'Two', 'Three', 'Four']\ntemp2 = ['One', 'Two']\n\nI need to create a third list with items from the first list which aren't present in the second one. From the example I have to get:\ntemp3 = ['Three', 'Four']\n\nAre there any fast ways without cycles and checking?\n", "output": "In [5]: list(set(temp1) - set(temp2))\nOut[5]: ['Four', 'Three']\n\n"} {"input": "How do I get the path and name of the file that is currently executing? I have scripts calling other script files but I need to get the filepath of the file that is currently running within the process. \nFor example, let's say I have three files. Using execfile:\n\nscript_1.py calls script_2.py. \nIn turn, script_2.py calls script_3.py. \n\nHow can I get the file name and path of script_3.py, from code within script_3.py, without having to pass that information as arguments from script_2.py?\n(Executing os.getcwd() returns the original starting script's filepath not the current file's.)\n", "output": "__file__\n\nas others have said. 
You may want to use: \nos.path.realpath(__file__)\n\n"} {"input": "Pretty printing XML in Python What is the best way (or even the various ways) to pretty print xml in Python?\n", "output": "import xml.dom.minidom\n\nxml = xml.dom.minidom.parse(xml_fname) # or xml.dom.minidom.parseString(xml_string)\npretty_xml_as_string = xml.toprettyxml()\n\n"} {"input": "How to move a file in Python I looked into the Python os interface, but was unable to locate a method to move a file. How would I do the equivalent of $ mv ... in Python?\n>>> source_files = '/PATH/TO/FOLDER/*'\n>>> destination_folder = 'PATH/TO/FOLDER'\n>>> # equivalent of $ mv source_files destination_folder\n\n", "output": "os.rename() or shutil.move()\nBoth employ the same syntax:\nos.rename(\"path/to/current/file.foo\", \"path/to/new/desination/for/file.foo\")\nshutil.move(\"path/to/current/file.foo\", \"path/to/new/destination/for/file.foo\")\n\nNote that in both cases the directory in which the new file is being created must already exist, (but, on Windows, a file with that name must not or an exception will be raised). Note also, you must include the file name (file.foo) in both the source and destination arguments. If it is changed, the file will be renamed as well as moved.\nAs has been noted in comments on other answers, shutil.move simply calls os.rename in most cases. However, if the destination is on a different disk than the source, it will instead copy and then delete the source file.\n"} {"input": "How do I find the location of Python module sources? How do I learn where the source file for a given Python module is installed? 
Is the method different on Windows than on Linux?\nI'm trying to look for the source of the datetime module in particular, but I'm interested in a more general answer as well.\n", "output": "For a pure python module you can find the source by looking at themodule.__file__.\nThe datetime module, however, is written in C, and therefore datetime.__file__ points to a .so file (there is no datetime.__file__ on Windows), and therefore, you can't see the source.\nIf you download a python source tarball and extract it, the modules' code can be found in the Modules subdirectory.\nFor example, if you want to find the datetime code for python 2.6, you can look at\nPython-2.6/Modules/datetimemodule.c\n\nYou can also find the latest Mercurial version on the web at \nhttps://hg.python.org/cpython/file/tip/Modules/_datetimemodule.c\n"} {"input": "Measure time elapsed in Python? What I want is to start counting time somewhere in my code and then get the passed time, to measure the time it took to execute few function. I think I'm using the timeit module wrong, but the docs are just confusing for me.\nimport timeit\n\nstart = timeit.timeit()\nprint \"hello\"\nend = timeit.timeit()\nprint end - start\n\n", "output": "If you just want to measure the elapsed wall-clock time between two points, you could use time.time():\nimport time\n\nstart = time.time()\nprint(\"hello\")\nend = time.time()\nprint(end - start)\n\nThis gives the execution time in seconds.\nedit A better option might be to use time.clock (thanks @Amber):\n\nOn Unix, return the current processor time as a floating point number\n expressed in seconds. 
The precision, and in fact the very definition\n of the meaning of \u201cprocessor time\u201d, depends on that of the C function\n of the same name, but in any case, this is the function to use for\n benchmarking Python or timing algorithms.\nOn Windows, this function returns wall-clock seconds elapsed since the\n first call to this function, as a floating point number, based on the\n Win32 function QueryPerformanceCounter(). The resolution is typically\n better than one microsecond.\n\n"} {"input": "clang error: unknown argument: '-mno-fused-madd' (python package installation failure) I get the following error when attempting to install psycopg2 via pip on Mavericks 10.9:\nclang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]\n\nNot sure how to proceed and have searched here and elsewhere for this particular error. Any help is much appreciated!\nHere is the complete output from pip:\n$ pip install psycopg2\nDownloading/unpacking psycopg2\n Downloading psycopg2-2.5.2.tar.gz (685kB): 685kB downloaded\n Running setup.py (path:/private/var/folders/0z/ljjwsjmn4v9_zwm81vhxj69m0000gn/T/pip_build_tino/psycopg2/setup.py) egg_info for package psycopg2\n\nInstalling collected packages: psycopg2\n Running setup.py install for psycopg2\n building 'psycopg2._psycopg' extension\n cc -fno-strict-aliasing -fno-common -dynamic -arch x86_64 -arch i386 -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch x86_64 -arch i386 -pipe -DPSYCOPG_DEFAULT_PYDATETIME=1 -DPSYCOPG_VERSION=\"2.5.2 (dt dec pq3 ext)\" -DPG_VERSION_HEX=0x090303 -DPSYCOPG_EXTENSIONS=1 -DPSYCOPG_NEW_BOOLEAN=1 -DHAVE_PQFREEMEM=1 -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -I. 
-I/usr/local/Cellar/postgresql/9.3.3/include -I/usr/local/Cellar/postgresql/9.3.3/include/server -c psycopg/psycopgmodule.c -o build/temp.macosx-10.9-intel-2.7/psycopg/psycopgmodule.o\n clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]\n clang: note: this will be a hard error (cannot be downgraded to a warning) in the future\n error: command 'cc' failed with exit status 1\n Complete output from command /usr/bin/python -c \"import setuptools, tokenize;__file__='/private/var/folders/0z/ljjwsjmn4v9_zwm81vhxj69m0000gn/T/pip_build_tino/psycopg2/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\\r\\n', '\\n'), __file__, 'exec'))\" install --record /var/folders/0z/ljjwsjmn4v9_zwm81vhxj69m0000gn/T/pip-bnWiwB-record/install-record.txt --single-version-externally-managed --compile:\n running install\n\nrunning build\n\nrunning build_py\n\ncreating build\n\ncreating build/lib.macosx-10.9-intel-2.7\n\ncreating build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncopying lib/__init__.py -> build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncopying lib/_json.py -> build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncopying lib/_range.py -> build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncopying lib/errorcodes.py -> build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncopying lib/extensions.py -> build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncopying lib/extras.py -> build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncopying lib/pool.py -> build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncopying lib/psycopg1.py -> build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncopying lib/tz.py -> build/lib.macosx-10.9-intel-2.7/psycopg2\n\ncreating build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/__init__.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/dbapi20.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/dbapi20_tpc.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_async.py -> 
build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_bug_gc.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_bugX000.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_cancel.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_connection.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_copy.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_cursor.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_dates.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_extras_dictcursor.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_green.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_lobject.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_module.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_notify.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_psycopg2_dbapi20.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_quote.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_transaction.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_types_basic.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_types_extras.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/test_with.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/testconfig.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\ncopying tests/testutils.py -> build/lib.macosx-10.9-intel-2.7/psycopg2/tests\n\nrunning build_ext\n\nbuilding 'psycopg2._psycopg' extension\n\ncreating build/temp.macosx-10.9-intel-2.7\n\ncreating build/temp.macosx-10.9-intel-2.7/psycopg\n\ncc -fno-strict-aliasing -fno-common -dynamic -arch x86_64 -arch i386 -g -Os -pipe -fno-common -fno-strict-aliasing 
-fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch x86_64 -arch i386 -pipe -DPSYCOPG_DEFAULT_PYDATETIME=1 -DPSYCOPG_VERSION=\"2.5.2 (dt dec pq3 ext)\" -DPG_VERSION_HEX=0x090303 -DPSYCOPG_EXTENSIONS=1 -DPSYCOPG_NEW_BOOLEAN=1 -DHAVE_PQFREEMEM=1 -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -I. -I/usr/local/Cellar/postgresql/9.3.3/include -I/usr/local/Cellar/postgresql/9.3.3/include/server -c psycopg/psycopgmodule.c -o build/temp.macosx-10.9-intel-2.7/psycopg/psycopgmodule.o\n\nclang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]\n\nclang: note: this will be a hard error (cannot be downgraded to a warning) in the future\n\nerror: command 'cc' failed with exit status 1\n\n----------------------------------------\nCleaning up...\nCommand /usr/bin/python -c \"import setuptools, tokenize;__file__='/private/var/folders/0z/ljjwsjmn4v9_zwm81vhxj69m0000gn/T/pip_build_tino/psycopg2/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\\r\\n', '\\n'), __file__, 'exec'))\" install --record /var/folders/0z/ljjwsjmn4v9_zwm81vhxj69m0000gn/T/pip-bnWiwB-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /private/var/folders/0z/ljjwsjmn4v9_zwm81vhxj69m0000gn/T/pip_build_tino/psycopg2\n\n", "output": "You can tell clang to not raise this as an error by setting the following environment variables prior to compilation:\nexport CFLAGS=-Qunused-arguments\nexport CPPFLAGS=-Qunused-arguments\n\nThen pip install psycopg2 should work.\nI had the same problem when trying to pip install lxml.\nEdit: if you are installing as superuser (which will likely be the case if you are trying to append to /Library/Python/2.7/site-packages, the native Apple factory-installed Python distribution which ships with OS X, rather than to some 
other Python distribution which you have subsequently installed yourself), then you will need to do, as described by @Thijs Kuipers in comments below:\nsudo -E pip install psycopg2\n\nor the equivalent, for whatever other package name you may be substituting in place of psycopg2.\nUPDATE [2014-05-16]: Apple has fixed this problem with updated system Pythons (2.7, 2.6, and 2.5) in OS X 10.9.3 so the workaround is no longer necessary when using the latest Mavericks and Xcode 5.1+. However, as of now, the workaround is still required for OS X 10.8.x (Mountain Lion, currently 10.8.5) if you are using Xcode 5.1+ there.\n"} {"input": "Dead simple argparse example wanted: 1 argument, 3 results The documentation for the argparse python module, while excellent I'm sure, is too much for my tiny beginner brain to grasp right now. Likewise for many questions here on the stack. I don't need to do math on the command line or meddle with formatting lines on the screen or change option characters. All I want to do is \"if arg is A, do this, if B do that, if none of the above show help and quit\".\n", "output": "Here's the way I do it with argparse (with multiple args):\nparser = argparse.ArgumentParser(description='Description of your program')\nparser.add_argument('-f','--foo', help='Description for foo argument', required=True)\nparser.add_argument('-b','--bar', help='Description for bar argument', required=True)\nargs = vars(parser.parse_args())\n\nargs will be a dictionary containing the arguments:\nif args['foo'] == 'Hello':\n # code here\n\nif args['bar'] == 'World':\n # code here\n\nIn your case simply add only one argument.\n"} {"input": "Python Image Library fails with message \"decoder JPEG not available\" - PIL PIL does support JPEG in my system.\nWhenever I do an upload, my code is failing with:\nFile \"PIL/Image.py\", line 375, in _getdecoder\n raise IOError(\"decoder %s not available\" % decoder_name)\nIOError: decoder jpeg not available\n\nHow can I resolve 
this?\n", "output": "libjpeg-dev is required to be able to process jpegs with pillow (or PIL), so you need to install it and then recompile pillow. It also seems that libjpeg8-dev is needed on Ubuntu 14.04.\nIf you're still using PIL then you should really be using pillow these days though, so first pip uninstall PIL before following these instructions to switch (or, if you have a good reason for sticking with PIL, replace \"pillow\" with \"PIL\" in the below).\nOn Ubuntu:\n# install libjpeg-dev with apt\nsudo apt-get install libjpeg-dev\n# if you're on Ubuntu 14.04, also install this\nsudo apt-get install libjpeg8-dev\n\n# reinstall pillow\npip install --no-cache-dir -I pillow\n\nIf that doesn't work, try one of the below, depending on whether you are on 64bit or 32bit Ubuntu.\nFor Ubuntu x64:\nsudo ln -s /usr/lib/x86_64-linux-gnu/libjpeg.so /usr/lib\nsudo ln -s /usr/lib/x86_64-linux-gnu/libfreetype.so /usr/lib\nsudo ln -s /usr/lib/x86_64-linux-gnu/libz.so /usr/lib\n\nOr for Ubuntu 32bit:\nsudo ln -s /usr/lib/i386-linux-gnu/libjpeg.so /usr/lib/\nsudo ln -s /usr/lib/i386-linux-gnu/libfreetype.so.6 /usr/lib/\nsudo ln -s /usr/lib/i386-linux-gnu/libz.so /usr/lib/\n\nThen reinstall pillow:\npip install --no-cache-dir -I pillow\n\n(Edits to include feedback from comments. Thanks Charles Offenbacher for pointing out this differs for 32bit, and t-mart for suggesting use of --no-cache-dir.)\n"} {"input": "Convert a Unicode string to a string in Python (containing extra symbols) How do you convert a Unicode string (containing extra characters like \u00a3, $, etc.) into a Python string?\n", "output": "title = u\"Kl\u00fcft skr\u00e4ms inf\u00f6r p\u00e5 f\u00e9d\u00e9ral \u00e9lectoral gro\u00dfe\"\nimport unicodedata\nunicodedata.normalize('NFKD', title).encode('ascii','ignore')\n'Kluft skrams infor pa federal electoral groe'\n\n"} {"input": "Converting from a string to boolean in Python? 
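The normalization trick just shown can be verified directly; note that on Python 3 .encode() returns bytes, so the result is decoded back to str here:

```python
import unicodedata

title = u"Klüft skräms inför på fédéral électoral große"

# NFKD splits each accented character into a base letter plus a
# combining mark; encoding to ASCII with errors='ignore' then drops
# the marks (and any character with no ASCII base, such as ß).
ascii_title = (
    unicodedata.normalize("NFKD", title)
    .encode("ascii", "ignore")
    .decode("ascii")
)
print(ascii_title)  # Kluft skrams infor pa federal electoral groe
```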
Does anyone know how to convert from a string to a boolean in Python? I found this link. But it doesn't look like a proper way to do it. I.e. using built-in functionality, etc.\nEDIT: The reason I asked this is because I learned int(\"string\"), from here. I tried bool(\"string\") but always got True.\n", "output": "Really, you just compare the string to whatever you expect to accept as representing true, so you can do this:\ns == 'True'\n\nOr to check against a whole bunch of values:\ns in ['true', '1', 't', 'y', 'yes', 'yeah', 'yup', 'certainly', 'uh-huh']\n\nBe cautious when using the following:\n>>> bool(\"foo\")\nTrue\n>>> bool(\"\")\nFalse\n\nEmpty strings evaluate to False, but everything else evaluates to True. So this should not be used for any kind of parsing purposes.\n"} {"input": "Python: check if an object is a list or tuple (but not string) This is what I normally do in order to ascertain that the input is a list/tuple - but not a str. Because many times I stumbled upon bugs where a function passes a str object by mistake, and the target function does for x in lst assuming that lst is actually a list or tuple.\nassert isinstance(lst, (list, tuple))\n\nMy question is: is there a better way of achieving this?\n", "output": "I think\nassert not isinstance(lst, basestring)\n\nis actually what you want; otherwise you'll miss out on a lot of things which act like lists, but aren't subclasses of list or tuple.\n"} {"input": "How do I use Python's itertools.groupby()? I haven't been able to find an understandable explanation of how to actually use Python's itertools.groupby() function. What I'm trying to do is this:\n\nTake a list - in this case, the children of an objectified lxml element\nDivide it into groups based on some criteria\nThen later iterate over each of these groups separately.\n\nI've reviewed the documentation, and the examples, but I've had trouble trying to apply them beyond a simple list of numbers. 
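On Python 3 basestring no longer exists, so the negative check above becomes a test against str (and usually bytes). A small sketch of a helper built on that idea (the function name is made up for illustration):

```python
def is_listlike(obj):
    """True for iterables that are not plain strings or bytes."""
    if isinstance(obj, (str, bytes)):
        return False
    try:
        iter(obj)  # anything iterable qualifies, not just list/tuple
    except TypeError:
        return False
    return True

print(is_listlike([1, 2, 3]))                # True
print(is_listlike((1, 2, 3)))                # True
print(is_listlike("hello"))                  # False
print(is_listlike(x * x for x in range(3)))  # True
print(is_listlike(42))                       # False
```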
\nSo, how do I use of itertools.groupby()? Is there another technique I should be using? Pointers to good \"prerequisite\" reading would also be appreciated.\n", "output": "As Sebastjan said, you first have to sort your data. This is important.\nThe part I didn't get is that in the example construction\ngroups = []\nuniquekeys = []\nfor k, g in groupby(data, keyfunc):\n groups.append(list(g)) # Store group iterator as a list\n uniquekeys.append(k)\n\nk is the current grouping key, and g is an iterator that you can use to iterate over the group defined by that grouping key. In other words, the groupby iterator itself returns iterators.\nHere's an example of that, using clearer variable names:\nfrom itertools import groupby\n\nthings = [(\"animal\", \"bear\"), (\"animal\", \"duck\"), (\"plant\", \"cactus\"), (\"vehicle\", \"speed boat\"), (\"vehicle\", \"school bus\")]\n\nfor key, group in groupby(things, lambda x: x[0]):\n for thing in group:\n print \"A %s is a %s.\" % (thing[1], key)\n print \" \"\n\nThis will give you the output:\n\nA bear is a animal.\n A duck is a animal.\nA cactus is a plant.\nA speed boat is a vehicle.\n A school bus is a vehicle.\n\nIn this example, things is a list of tuples where the first item in each tuple is the group the second item belongs to. \nThe groupby() function takes two arguments: (1) the data to group and (2) the function to group it with. \nHere, lambda x: x[0] tells groupby() to use the first item in each tuple as the grouping key.\nIn the above for statement, groupby returns three (key, group iterator) pairs - once for each unique key. 
You can use the returned iterator to iterate over each individual item in that group.\nHere's a slightly different example with the same data, using a list comprehension:\nfor key, group in groupby(things, lambda x: x[0]):\n listOfThings = \" and \".join([thing[1] for thing in group])\n print key + \"s: \" + listOfThings + \".\"\n\nThis will give you the output:\n\nanimals: bear and duck.\n plants: cactus.\n vehicles: speed boat and school bus.\n\n"} {"input": "Flatten (an irregular) list of lists in Python Yes, I know this subject has been covered before (here, here, here, here), but as far as I know, all solutions, except for one, fail on a list like this:\nL = [[[1, 2, 3], [4, 5]], 6]\n\nWhere the desired output is\n[1, 2, 3, 4, 5, 6]\n\nOr perhaps even better, an iterator. The only solution I saw that works for an arbitrary nesting is found in this question:\ndef flatten(x):\n result = []\n for el in x:\n if hasattr(el, \"__iter__\") and not isinstance(el, basestring):\n result.extend(flatten(el))\n else:\n result.append(el)\n return result\n\nflatten(L)\n\nIs this the best model? Did I overlook something? Any problems?\n", "output": "Using generator functions can make your example a little easier to read and probably boost the performance.\nPython 2\ndef flatten(l):\n for el in l:\n if isinstance(el, collections.Iterable) and not isinstance(el, basestring):\n for sub in flatten(el):\n yield sub\n else:\n yield el\n\nI used the Iterable ABC added in 2.6.\nPython 3\nIn Python 3, the basestring is no more, but you can use a tuple of str and bytes to get the same effect there.\nThe yield from operator returns an item from a generator one at a time. 
This syntax for delegating to a subgenerator was added in 3.3.\ndef flatten(l):\n for el in l:\n if isinstance(el, collections.Iterable) and not isinstance(el, (str, bytes)):\n yield from flatten(el)\n else:\n yield el\n\n"} {"input": "How to read large file, line by line in python I want to iterate over each line of an entire file. One way to do this is by reading the entire file, saving it to a list, then going over the line of interest. This method uses a lot of memory, so I am looking for an alternative.\nMy code so far:\nfor each_line in fileinput.input(input_file):\n do_something(each_line)\n\n for each_line_again in fileinput.input(input_file):\n do_something(each_line_again)\n\nExecuting this code gives an error message: device active.\nAny suggestions?\nEDIT: The purpose is to calculate pair-wise string similarity, meaning for each line in the file, I want to calculate the Levenshtein distance with every other line.\n", "output": "Nobody has given the correct, fully Pythonic way to read a file. It's the following:\nwith open(...) as f:\n for line in f:\n do_something(line) # process each line here\n\nThe with statement handles opening and closing the file, including if an exception is raised in the inner block. The for line in f treats the file object f as an iterable, which automatically uses buffered IO and memory management so you don't have to worry about large files.\n\nThere should be one -- and preferably only one -- obvious way to do it.\n\n"} {"input": "How to sort (list/tuple) of lists/tuples? I have some data, either in a list of lists or a list of tuples.\ndata = [[1,2,3], [4,5,6], [7,8,9]]\ndata = [(1,2,3), (4,5,6), (7,8,9)]\n\nAnd I want to sort by the 2nd element in the subset. Meaning, sorting by 2,5,8 where 2 is from (1,2,3), 5 is from (4,5,6). What is the common way to do this? Should I store tuples or lists in my list? 
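A self-contained check of the with/for pattern just described, using a throwaway temporary file:

```python
import os
import tempfile

# Write a small throwaway file to iterate over.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("alpha\nbeta\ngamma\n")

lines = []
with open(path) as f:      # closed automatically, even on exceptions
    for line in f:         # one line at a time; never loads the whole file
        lines.append(line.rstrip("\n"))

print(lines)  # ['alpha', 'beta', 'gamma']
```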
Since tuples are more inflexible.\n", "output": "sorted_by_second = sorted(data, key=lambda tup: tup[1])\n\nor:\ndata.sort(key=lambda tup: tup[1]) # sorts in place\n\n"} {"input": "How to re import an updated package while in Python Interpreter? I often test my module in the Python Interpreter, and when I see an error, I quickly update the .py file. But how do I make it reflect in the Interpreter? So far, I have been exiting and re-entering the Interpreter because re-importing the file again is not working for me.\n", "output": "\"Reload is no longer a function in Python 3. Use imp.reload() instead\" copied from comments\nUse the reload builtin function:\nhttps://docs.python.org/2/library/functions.html#reload\n\nWhen reload(module) is executed:\n\nPython modules\u2019 code is recompiled and the module-level code reexecuted, defining a new set of objects which are bound to names in the module\u2019s dictionary. The init function of extension modules is not called a second time.\nAs with all other objects in Python the old objects are only reclaimed after their reference counts drop to zero.\nThe names in the module namespace are updated to point to any new or changed objects.\nOther references to the old objects (such as names external to the module) are not rebound to refer to the new objects and must be updated in each namespace where they occur if that is desired.\n\n\nExample:\n# Make a simple function that prints \"version 1\"\nshell1$ echo 'def x(): print \"version 1\"' > mymodule.py\n\n# Run the module\nshell2$ python\n>>> import mymodule\n>>> mymodule.x()\nversion 1\n\n# Change mymodule to print \"version 2\" (without exiting the python REPL)\nshell2$ echo 'def x(): print \"version 2\"' > mymodule.py\n\n# Back in that same python session\n>>> reload(mymodule)\n\n>>> mymodule.x()\nversion 2\n\n"} {"input": "What is the purpose of the single underscore \"_\" variable in Python? 
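On Python 3 the reload workflow above uses importlib.reload (imp.reload is deprecated). A self-contained sketch of the same version-1/version-2 experiment, with a made-up module name written to a temp directory:

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid a stale .pyc masking the reload
moddir = tempfile.mkdtemp()
sys.path.insert(0, moddir)
modpath = os.path.join(moddir, "mymodule_demo.py")

# Version 1 of the module.
with open(modpath, "w") as f:
    f.write("VERSION = 1\n")
import mymodule_demo
print(mymodule_demo.VERSION)  # 1

# Edit the file on disk, then reload to re-execute the new source.
with open(modpath, "w") as f:
    f.write("VERSION = 2\n")
importlib.reload(mymodule_demo)
print(mymodule_demo.VERSION)  # 2
```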
What is the meaning of _ after for in this code?\nif tbh.bag:\n n = 0\n for _ in tbh.bag.atom_set():\n n += 1\n\n", "output": "_ has 3 main conventional uses in Python:\n\nTo hold the result of the last executed statement in an interactive\ninterpreter session. This precedent was set by the standard CPython\ninterpreter, and other interpreters have followed suit\nFor translation lookup in i18n (imported from the corresponding C\nconventions, I believe), as in code like: raise\nforms.ValidationError(_(\"Please enter a correct username\"))\nAs a general-purpose \"throwaway\" variable name to indicate that part\nof a function result is being deliberately ignored, as in code like:\nlabel, has_label, _ = text.partition(':')\n\nThe latter two purposes can conflict, so it is necessary to avoid using _ as a throwaway variable in any code block that also uses it for i18n translation (many folks prefer a double-underscore, __, as their throwaway variable for exactly this reason).\n"} {"input": "Fastest way to list all primes below N This is the best algorithm I could come up with.\ndef get_primes(n):\n numbers = set(range(n, 1, -1))\n primes = []\n while numbers:\n p = numbers.pop()\n primes.append(p)\n numbers.difference_update(set(range(p*2, n+1, p)))\n return primes\n\n>>> timeit.Timer(stmt='get_primes.get_primes(1000000)', setup='import get_primes').timeit(1)\n1.1499958793645562\n\nCan it be made even faster?\nThis code has a flaw: Since numbers is an unordered set, there is no guarantee that numbers.pop() will remove the lowest number from the set. Nevertheless, it works (at least for me) for some input numbers:\n>>> sum(get_primes(2000000))\n142913828922L\n#That's the correct sum of all primes below 2 million\n>>> 529 in get_primes(1000)\nFalse\n>>> 529 in get_primes(530)\nTrue\n\n", "output": "Warning: timeit results may vary due to differences in hardware or\nversion of Python. 
\nBelow is a script which compares a number of implementations: \n\nambi_sieve_plain,\nrwh_primes, \nrwh_primes1, \nrwh_primes2, \nsieveOfAtkin, \nsieveOfEratosthenes, \nsundaram3,\nsieve_wheel_30,\nambi_sieve (requires numpy)\nprimesfrom3to (requires numpy)\nprimesfrom2to (requires numpy)\n\nMany thanks to stephan for bringing sieve_wheel_30 to my attention.\nCredit goes to Robert William Hanks for primesfrom2to, primesfrom3to, rwh_primes, rwh_primes1, and rwh_primes2.\nOf the plain Python methods tested, with psyco, for n=1000000,\nrwh_primes1 was the fastest tested.\n+---------------------+-------+\n| Method | ms |\n+---------------------+-------+\n| rwh_primes1 | 43.0 |\n| sieveOfAtkin | 46.4 |\n| rwh_primes | 57.4 |\n| sieve_wheel_30 | 63.0 |\n| rwh_primes2 | 67.8 | \n| sieveOfEratosthenes | 147.0 |\n| ambi_sieve_plain | 152.0 |\n| sundaram3 | 194.0 |\n+---------------------+-------+\n\nOf the plain Python methods tested, without psyco, for n=1000000,\nrwh_primes2 was the fastest.\n+---------------------+-------+\n| Method | ms |\n+---------------------+-------+\n| rwh_primes2 | 68.1 |\n| rwh_primes1 | 93.7 |\n| rwh_primes | 94.6 |\n| sieve_wheel_30 | 97.4 |\n| sieveOfEratosthenes | 178.0 |\n| ambi_sieve_plain | 286.0 |\n| sieveOfAtkin | 314.0 |\n| sundaram3 | 416.0 |\n+---------------------+-------+\n\nOf all the methods tested, allowing numpy, for n=1000000,\nprimesfrom2to was the fastest tested.\n+---------------------+-------+\n| Method | ms |\n+---------------------+-------+\n| primesfrom2to | 15.9 |\n| primesfrom3to | 18.4 |\n| ambi_sieve | 29.3 |\n+---------------------+-------+\n\nTimings were measured using the command:\npython -mtimeit -s\"import primes\" \"primes.{method}(1000000)\"\n\nwith {method} replaced by each of the method names.\nprimes.py:\n#!/usr/bin/env python\nimport psyco; psyco.full()\nfrom math import sqrt, ceil\nimport numpy as np\n\ndef rwh_primes(n):\n # 
http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188\n \"\"\" Returns a list of primes < n \"\"\"\n sieve = [True] * n\n for i in xrange(3,int(n**0.5)+1,2):\n if sieve[i]:\n sieve[i*i::2*i]=[False]*((n-i*i-1)/(2*i)+1)\n return [2] + [i for i in xrange(3,n,2) if sieve[i]]\n\ndef rwh_primes1(n):\n # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188\n \"\"\" Returns a list of primes < n \"\"\"\n sieve = [True] * (n/2)\n for i in xrange(3,int(n**0.5)+1,2):\n if sieve[i/2]:\n sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)\n return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]\n\ndef rwh_primes2(n):\n # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188\n \"\"\" Input n>=6, Returns a list of primes, 2 <= p < n \"\"\"\n correction = (n%6>1)\n n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]\n sieve = [True] * (n/3)\n sieve[0] = False\n for i in xrange(int(n**0.5)/3+1):\n if sieve[i]:\n k=3*i+1|1\n sieve[ ((k*k)/3) ::2*k]=[False]*((n/6-(k*k)/6-1)/k+1)\n sieve[(k*k+4*k-2*k*(i&1))/3::2*k]=[False]*((n/6-(k*k+4*k-2*k*(i&1))/6-1)/k+1)\n return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]\n\ndef sieve_wheel_30(N):\n # http://zerovolt.com/?p=88\n ''' Returns a list of primes <= N using wheel criterion 2*3*5 = 30\n\nCopyright 2009 by zerovolt.com\nThis code is free for non-commercial purposes, in which case you can just leave this comment as a credit for my work.\nIf you need this code for commercial purposes, please contact me by sending an email to: info [at] zerovolt [dot] com.'''\n __smallp = ( 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,\n 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139,\n 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227,\n 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311,\n 
313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401,\n 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491,\n 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599,\n 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683,\n 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797,\n 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887,\n 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997)\n\n wheel = (2, 3, 5)\n const = 30\n if N < 2:\n return []\n if N <= const:\n pos = 0\n while __smallp[pos] <= N:\n pos += 1\n return list(__smallp[:pos])\n # make the offsets list\n offsets = (7, 11, 13, 17, 19, 23, 29, 1)\n # prepare the list\n p = [2, 3, 5]\n dim = 2 + N // const\n tk1 = [True] * dim\n tk7 = [True] * dim\n tk11 = [True] * dim\n tk13 = [True] * dim\n tk17 = [True] * dim\n tk19 = [True] * dim\n tk23 = [True] * dim\n tk29 = [True] * dim\n tk1[0] = False\n # help dictionary d\n # d[a , b] = c ==> if I want to find the smallest useful multiple of (30*pos)+a\n # on tkc, then I need the index given by the product of [(30*pos)+a][(30*pos)+b]\n # in general. 
If b < a, I need [(30*pos)+a][(30*(pos+1))+b]\n d = {}\n for x in offsets:\n for y in offsets:\n res = (x*y) % const\n if res in offsets:\n d[(x, res)] = y\n # another help dictionary: gives tkx calling tmptk[x]\n tmptk = {1:tk1, 7:tk7, 11:tk11, 13:tk13, 17:tk17, 19:tk19, 23:tk23, 29:tk29}\n pos, prime, lastadded, stop = 0, 0, 0, int(ceil(sqrt(N)))\n # inner functions definition\n def del_mult(tk, start, step):\n for k in xrange(start, len(tk), step):\n tk[k] = False\n # end of inner functions definition\n cpos = const * pos\n while prime < stop:\n # 30k + 7\n if tk7[pos]:\n prime = cpos + 7\n p.append(prime)\n lastadded = 7\n for off in offsets:\n tmp = d[(7, off)]\n start = (pos + prime) if off == 7 else (prime * (const * (pos + 1 if tmp < 7 else 0) + tmp) )//const\n del_mult(tmptk[off], start, prime)\n # 30k + 11\n if tk11[pos]:\n prime = cpos + 11\n p.append(prime)\n lastadded = 11\n for off in offsets:\n tmp = d[(11, off)]\n start = (pos + prime) if off == 11 else (prime * (const * (pos + 1 if tmp < 11 else 0) + tmp) )//const\n del_mult(tmptk[off], start, prime)\n # 30k + 13\n if tk13[pos]:\n prime = cpos + 13\n p.append(prime)\n lastadded = 13\n for off in offsets:\n tmp = d[(13, off)]\n start = (pos + prime) if off == 13 else (prime * (const * (pos + 1 if tmp < 13 else 0) + tmp) )//const\n del_mult(tmptk[off], start, prime)\n # 30k + 17\n if tk17[pos]:\n prime = cpos + 17\n p.append(prime)\n lastadded = 17\n for off in offsets:\n tmp = d[(17, off)]\n start = (pos + prime) if off == 17 else (prime * (const * (pos + 1 if tmp < 17 else 0) + tmp) )//const\n del_mult(tmptk[off], start, prime)\n # 30k + 19\n if tk19[pos]:\n prime = cpos + 19\n p.append(prime)\n lastadded = 19\n for off in offsets:\n tmp = d[(19, off)]\n start = (pos + prime) if off == 19 else (prime * (const * (pos + 1 if tmp < 19 else 0) + tmp) )//const\n del_mult(tmptk[off], start, prime)\n # 30k + 23\n if tk23[pos]:\n prime = cpos + 23\n p.append(prime)\n lastadded = 23\n for off in offsets:\n 
tmp = d[(23, off)]\n start = (pos + prime) if off == 23 else (prime * (const * (pos + 1 if tmp < 23 else 0) + tmp) )//const\n del_mult(tmptk[off], start, prime)\n # 30k + 29\n if tk29[pos]:\n prime = cpos + 29\n p.append(prime)\n lastadded = 29\n for off in offsets:\n tmp = d[(29, off)]\n start = (pos + prime) if off == 29 else (prime * (const * (pos + 1 if tmp < 29 else 0) + tmp) )//const\n del_mult(tmptk[off], start, prime)\n # now we go back to top tk1, so we need to increase pos by 1\n pos += 1\n cpos = const * pos\n # 30k + 1\n if tk1[pos]:\n prime = cpos + 1\n p.append(prime)\n lastadded = 1\n for off in offsets:\n tmp = d[(1, off)]\n start = (pos + prime) if off == 1 else (prime * (const * pos + tmp) )//const\n del_mult(tmptk[off], start, prime)\n # time to add remaining primes\n # if lastadded == 1, remove last element and start adding them from tk1\n # this way we don't need an \"if\" within the last while\n if lastadded == 1:\n p.pop()\n # now complete for every other possible prime\n while pos < len(tk1):\n cpos = const * pos\n if tk1[pos]: p.append(cpos + 1)\n if tk7[pos]: p.append(cpos + 7)\n if tk11[pos]: p.append(cpos + 11)\n if tk13[pos]: p.append(cpos + 13)\n if tk17[pos]: p.append(cpos + 17)\n if tk19[pos]: p.append(cpos + 19)\n if tk23[pos]: p.append(cpos + 23)\n if tk29[pos]: p.append(cpos + 29)\n pos += 1\n # remove exceeding if present\n pos = len(p) - 1\n while p[pos] > N:\n pos -= 1\n if pos < len(p) - 1:\n del p[pos+1:]\n # return p list\n return p\n\ndef sieveOfEratosthenes(n):\n \"\"\"sieveOfEratosthenes(n): return the list of the primes < n.\"\"\"\n # Code from: , Nov 30 2006\n # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d\n if n <= 2:\n return []\n sieve = range(3, n, 2)\n top = len(sieve)\n for si in sieve:\n if si:\n bottom = (si*si - 3) // 2\n if bottom >= top:\n break\n sieve[bottom::si] = [0] * -((bottom - top) // si)\n return [2] + [el for el in sieve if el]\n\ndef sieveOfAtkin(end):\n 
\"\"\"sieveOfAtkin(end): return a list of all the prime numbers , improved\n # Code: https://web.archive.org/web/20080324064651/http://krenzel.info/?p=83\n # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin\n assert end > 0\n lng = ((end-1) // 2)\n sieve = [False] * (lng + 1)\n\n x_max, x2, xd = int(sqrt((end-1)/4.0)), 0, 4\n for xd in xrange(4, 8*x_max + 2, 8):\n x2 += xd\n y_max = int(sqrt(end-x2))\n n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1\n if not (n & 1):\n n -= n_diff\n n_diff -= 2\n for d in xrange((n_diff - 1) << 1, -1, -8):\n m = n % 12\n if m == 1 or m == 5:\n m = n >> 1\n sieve[m] = not sieve[m]\n n -= d\n\n x_max, x2, xd = int(sqrt((end-1) / 3.0)), 0, 3\n for xd in xrange(3, 6 * x_max + 2, 6):\n x2 += xd\n y_max = int(sqrt(end-x2))\n n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1\n if not(n & 1):\n n -= n_diff\n n_diff -= 2\n for d in xrange((n_diff - 1) << 1, -1, -8):\n if n % 12 == 7:\n m = n >> 1\n sieve[m] = not sieve[m]\n n -= d\n\n x_max, y_min, x2, xd = int((2 + sqrt(4-8*(1-end)))/4), -1, 0, 3\n for x in xrange(1, x_max + 1):\n x2 += xd\n xd += 6\n if x2 >= end: y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1\n n, n_diff = ((x*x + x) << 1) - 1, (((x-1) << 1) - 2) << 1\n for d in xrange(n_diff, y_min, -8):\n if n % 12 == 11:\n m = n >> 1\n sieve[m] = not sieve[m]\n n += d\n\n primes = [2, 3]\n if end <= 3:\n return primes[:max(0,end-2)]\n\n for n in xrange(5 >> 1, (int(sqrt(end))+1) >> 1):\n if sieve[n]:\n primes.append((n << 1) + 1)\n aux = (n << 1) + 1\n aux *= aux\n for k in xrange(aux, end, 2 * aux):\n sieve[k >> 1] = False\n\n s = int(sqrt(end)) + 1\n if s % 2 == 0:\n s += 1\n primes.extend([i for i in xrange(s, end, 2) if sieve[i >> 1]])\n\n return primes\n\ndef ambi_sieve_plain(n):\n s = range(3, n, 2)\n for m in xrange(3, int(n**0.5)+1, 2): \n if s[(m-3)/2]: \n for t in xrange((m*m-3)/2,(n>>1)-1,m):\n s[t]=0\n return [2]+[t for t in s if t>0]\n\ndef sundaram3(max_n):\n # 
http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/2073279#2073279\n numbers = range(3, max_n+1, 2)\n half = (max_n)//2\n initial = 4\n\n for step in xrange(3, max_n+1, 2):\n for i in xrange(initial, half, step):\n numbers[i-1] = 0\n initial += 2*(step+1)\n\n if initial > half:\n return [2] + filter(None, numbers)\n\n################################################################################\n# Using Numpy:\ndef ambi_sieve(n):\n # http://tommih.blogspot.com/2009/04/fast-prime-number-generator.html\n s = np.arange(3, n, 2)\n for m in xrange(3, int(n ** 0.5)+1, 2): \n if s[(m-3)/2]: \n s[(m*m-3)/2::m]=0\n return np.r_[2, s[s>0]]\n\ndef primesfrom3to(n):\n # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188\n \"\"\" Returns a array of primes, p < n \"\"\"\n assert n>=2\n sieve = np.ones(n/2, dtype=np.bool)\n for i in xrange(3,int(n**0.5)+1,2):\n if sieve[i/2]:\n sieve[i*i/2::i] = False\n return np.r_[2, 2*np.nonzero(sieve)[0][1::]+1] \n\ndef primesfrom2to(n):\n # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188\n \"\"\" Input n>=6, Returns a array of primes, 2 <= p < n \"\"\"\n sieve = np.ones(n/3 + (n%6==2), dtype=np.bool)\n sieve[0] = False\n for i in xrange(int(n**0.5)/3+1):\n if sieve[i]:\n k=3*i+1|1\n sieve[ ((k*k)/3) ::2*k] = False\n sieve[(k*k+4*k-2*k*(i&1))/3::2*k] = False\n return np.r_[2,3,((3*np.nonzero(sieve)[0]+1)|1)]\n\nif __name__=='__main__':\n import itertools\n import sys\n\n def test(f1,f2,num):\n print('Testing {f1} and {f2} return same results'.format(\n f1=f1.func_name,\n f2=f2.func_name))\n if not all([a==b for a,b in itertools.izip_longest(f1(num),f2(num))]):\n sys.exit(\"Error: %s(%s) != %s(%s)\"%(f1.func_name,num,f2.func_name,num))\n\n n=1000000\n test(sieveOfAtkin,sieveOfEratosthenes,n)\n test(sieveOfAtkin,ambi_sieve,n)\n test(sieveOfAtkin,ambi_sieve_plain,n) \n 
test(sieveOfAtkin,sundaram3,n)\n test(sieveOfAtkin,sieve_wheel_30,n)\n test(sieveOfAtkin,primesfrom3to,n)\n test(sieveOfAtkin,primesfrom2to,n)\n test(sieveOfAtkin,rwh_primes,n)\n test(sieveOfAtkin,rwh_primes1,n) \n test(sieveOfAtkin,rwh_primes2,n)\n\nRunning the script tests that all implementations give the same result.\n"} {"input": "How do you express binary literals in Python? How do you express an integer as a binary number with Python literals?\nI was easily able to find the answer for hex:\n >>> 0x12AF\n 4783\n >>> 0x100\n 256\n\nand octal:\n >>> 01267\n 695\n >>> 0100\n 64\n\nHow do you use literals to express binary in Python?\n\nSummary of Answers\n\nPython 2.5 and earlier: can express binary using int('01010101111',2) but not with a literal.\nPython 2.5 and earlier: there is no way to express binary literals.\nPython 2.6 beta: You can do like so: 0b1100111 or 0B1100111.\nPython 2.6 beta: will also allow 0o27 or 0O27 (second character is the letter O) to represent an octal.\nPython 3.0 beta: Same as 2.6, but will no longer allow the older 027 syntax for octals.\n\n", "output": "For reference\u2014future Python possibilities:\nStarting with Python 2.6 you can express binary literals using the prefix 0b or 0B:\n>>> 0b101111\n47\n\nYou can also use the new bin function to get the binary representation of a number:\n>>> bin(173)\n'0b10101101'\n\nDevelopment version of the documentation: What's New in Python 2.6\n"} {"input": "Showing the stack trace from a running Python application I have this Python application that gets stuck from time to time and I can't find out where.\nIs there any way to signal Python interpreter to show you the exact code that's running? 
\nSome kind of on-the-fly stacktrace?\nRelated questions:\n\nPrint current call stack from a method in Python code\nCheck what a running process is doing: print stack trace of an uninstrumented Python program\n\n", "output": "I have a module I use for situations like this - where a process will be running for a long time but gets stuck sometimes for unknown and irreproducible reasons. It's a bit hacky, and only works on unix (requires signals):\nimport code, traceback, signal\n\ndef debug(sig, frame):\n \"\"\"Interrupt running process, and provide a python prompt for\n interactive debugging.\"\"\"\n d={'_frame':frame} # Allow access to frame object.\n d.update(frame.f_globals) # Unless shadowed by global\n d.update(frame.f_locals)\n\n i = code.InteractiveConsole(d)\n message = \"Signal received: entering python shell.\\nTraceback:\\n\"\n message += ''.join(traceback.format_stack(frame))\n i.interact(message)\n\ndef listen():\n signal.signal(signal.SIGUSR1, debug) # Register handler\n\nTo use, just call the listen() function at some point when your program starts up (you could even stick it in site.py to have all python programs use it), and let it run. At any point, send the process a SIGUSR1 signal, using kill, or in python:\n os.kill(pid, signal.SIGUSR1)\n\nThis will cause the program to break to a python console at the point it is currently at, showing you the stack trace, and letting you manipulate the variables. Use control-d (EOF) to continue running (though note that you will probably interrupt any I/O etc. at the point you signal, so it isn't fully non-intrusive).\nI have another script that does the same thing, except it communicates with the running process through a pipe (to allow for debugging backgrounded processes etc). 
It's a bit large to post here, but I've added it as a python cookbook recipe.\n"} {"input": "Creating a JSON response using Django and Python I'm trying to convert a server side Ajax response script into a Django HttpResponse, but apparently it's not working.\nThis is the server-side script:\n/* RECEIVE VALUE */\n$validateValue=$_POST['validateValue'];\n$validateId=$_POST['validateId'];\n$validateError=$_POST['validateError'];\n\n/* RETURN VALUE */\n$arrayToJs = array();\n$arrayToJs[0] = $validateId;\n$arrayToJs[1] = $validateError;\n\nif($validateValue ==\"Testuser\"){ // Validate??\n $arrayToJs[2] = \"true\"; // RETURN TRUE\n echo '{\"jsonValidateReturn\":'.json_encode($arrayToJs).'}'; // RETURN ARRAY WITH success\n}\nelse{\n for($x=0;$x<1000000;$x++){\n if($x == 990000){\n $arrayToJs[2] = \"false\";\n echo '{\"jsonValidateReturn\":'.json_encode($arrayToJs).'}'; // RETURNS ARRAY WITH ERROR.\n }\n }\n}\n\nAnd this is the converted code:\ndef validate_user(request):\n if request.method == 'POST':\n vld_value = request.POST.get('validateValue')\n vld_id = request.POST.get('validateId')\n vld_error = request.POST.get('validateError')\n\n array_to_js = [vld_id, vld_error, False]\n\n if vld_value == \"TestUser\":\n array_to_js[2] = True\n x = simplejson.dumps(array_to_js)\n return HttpResponse(x)\n else:\n array_to_js[2] = False\n x = simplejson.dumps(array_to_js)\n error = 'Error'\n return render_to_response('index.html',{'error':error},context_instance=RequestContext(request))\n return render_to_response('index.html',context_instance=RequestContext(request))\n\nI'm using simplejson to encode the Python list (so it will return a JSON array). I couldn't figure out the problem yet. But I think that I did something wrong about the 'echo'.\n", "output": "I usually use a dictionary, not a list to return JSON content. 
\nimport json\n\nfrom django.http import HttpResponse\n\nresponse_data = {}\nresponse_data['result'] = 'error'\nresponse_data['message'] = 'Some error message'\n\nPre-Django 1.7 you'd return it like this:\nreturn HttpResponse(json.dumps(response_data), content_type=\"application/json\")\n\nFor Django 1.7+, use JsonResponse as shown in this SO answer like so : \nfrom django.http import JsonResponse\nreturn JsonResponse({'foo':'bar'})\n\n"} {"input": "How do I correctly clean up a Python object? class Package:\n def __init__(self):\n self.files = []\n\n # ...\n\n def __del__(self):\n for file in self.files:\n os.unlink(file)\n\n__del__(self) above fails with an AttributeError exception. I understand Python doesn't guarantee the existence of \"global variables\" (member data in this context?) when __del__() is invoked. If that is the case and this is the reason for the exception, how do I make sure the object destructs properly?\n", "output": "I'd recommend using Python's with statement for managing resources that need to be cleaned up. The problem with using an explicit close() statement is that you have to worry about people forgetting to call it at all or forgetting to place it in a finally block to prevent a resource leak when an exception occurs.\nTo use the with statement, create a class with the following methods:\n def __enter__(self)\n def __exit__(self, exc_type, exc_value, traceback)\n\nIn your example above, you'd use \nclass Package:\n def __init__(self):\n self.files = []\n\n def __enter__(self):\n return self\n\n # ...\n\n def __exit__(self, exc_type, exc_value, traceback):\n for file in self.files:\n os.unlink(file)\n\nThen, when someone wanted to use your class, they'd do the following:\nwith Package() as package_obj:\n # use package_obj\n\nThe variable package_obj will be an instance of type Package (it's the value returned by the __enter__ method). 
Its __exit__ method will automatically be called, regardless of whether or not an exception occurs.\nYou could even take this approach a step further. In the example above, someone could still instantiate Package using its constructor without using the with clause. You don't want that to happen. You can fix this by creating a PackageResource class that defines the __enter__ and __exit__ methods. Then, the Package class would be defined strictly inside the __enter__ method and returned. That way, the caller never could instantiate the Package class without using a with statement:\nclass PackageResource:\n def __enter__(self):\n class Package:\n ...\n self.package_obj = Package()\n return self.package_obj\n\n def __exit__(self, exc_type, exc_value, traceback):\n self.package_obj.cleanup()\n\nYou'd use this as follows:\nwith PackageResource() as package_obj:\n # use package_obj\n\n"} {"input": "Proper way to use **kwargs in Python What is the proper way to use **kwargs in Python when it comes to default values?\nkwargs returns a dictionary, but what is the best way to set default values, or is there one? Should I just access it as a dictionary? Use get function? \nclass ExampleClass:\n def __init__(self, **kwargs):\n self.val = kwargs['val']\n self.val2 = kwargs.get('val2')\n\nA simple question, but one that I can't find good resources on. 
People do it different ways in code that I've seen and it's hard to know what to use.\n", "output": "You can pass a default value to get() for keys that are not in the dictionary:\nself.val2 = kwargs.get('val2',\"default value\")\n\nHowever, if you plan on using a particular argument with a particular default value, why not use named arguments in the first place?\ndef __init__(self, val2=\"default value\", **kwargs):\n\n"} {"input": "No module named pkg_resources I'm deploying a Django app to a dev server and am hitting this error when I run pip install requirements.txt:\nTraceback (most recent call last):\n File \"/var/www/mydir/virtualenvs/dev/bin/pip\", line 5, in <module>\n from pkg_resources import load_entry_point\nImportError: No module named pkg_resources\n\npkg_resources appears to be distributed with setuptools. Initially I thought this might not be installed to the python in the virtualenv so I installed setuptools (2.6, same version as Python) to the Python site-packages in the virtualenv with the following command:\nsh setuptools-0.6c11-py2.6.egg --install-dir /var/www/mydir/virtualenvs/dev/lib/python2.6/site-packages\n\nEDIT\nThis only happens inside the virtualenv. If I open a console outside the virtualenv then pkg_resources is present\nbut I am still getting the same error. Any ideas as to why pkg_resources is not on the path?\n", "output": "I encountered the same ImportError today while trying to use pip. Somehow the setuptools package had been deleted in my Python environment.\nTo fix the issue, run the setup script for setuptools:\nwget https://bootstrap.pypa.io/ez_setup.py -O - | python\n\n(or if you don't have wget installed (e.g. 
OS X), try\ncurl https://bootstrap.pypa.io/ez_setup.py | python\n\npossibly with sudo prepended.)\nIf you have any version of distribute, or any setuptools below 0.6, you will have to uninstall it first.*\nSee Installation Instructions for further details.\n\n* If you already have a working distribute, upgrading it to the \"compatibility wrapper\" that switches you over to setuptools is easier. But if things are already broken, don't try that.\n"} {"input": "Rename Files in Python I'm trying to rename some files in a directory using Python.\nSay I have a file called CHEESE_CHEESE_TYPE.*** and want to remove CHEESE_ so my resulting filename would be CHEESE_TYPE\nI'm trying to use the os.path.split but it's not working properly. I have also considered using string manipulations, but have not been successful with that either.\n", "output": "Do you want something like this?\n$ ls\ncheese_cheese_type.bar cheese_cheese_type.foo\n$ python\n>>> import os\n>>> for filename in os.listdir(\".\"):\n... if filename.startswith(\"cheese_\"):\n... os.rename(filename, filename[7:])\n... \n>>> \n$ ls\ncheese_type.bar cheese_type.foo\n\n"} {"input": "Find which version of package is installed with pip Using pip, is it possible to figure out which version of a package is currently installed?\nI know about pip install XYZ --upgrade but I am wondering if there is anything like pip info XYZ. If not what would be the best way to tell what version I am currently using.\n", "output": "As of pip 1.3, there is a pip show command.\n$ pip show Jinja2\n---\nName: Jinja2\nVersion: 2.7.3\nLocation: /path/to/virtualenv/lib/python2.7/site-packages\nRequires: markupsafe\n\nIn older versions, pip freeze and grep should do the job nicely.\n$ pip freeze | grep Jinja2\nJinja2==2.7.3\n\n"} {"input": "How to create a GUID/UUID in Python How do I create a GUID in Python that is platform independent? I hear there is a method using ActivePython on Windows but it's Windows only because it uses COM. 
Is there a method using plain Python?\n", "output": "If you're using Python 2.5 or later, the uuid module is already included with the Python standard distribution.\nEx: \n>>> import uuid\n>>> uuid.uuid4()\nUUID('5361a11b-615c-42bf-9bdb-e2c3790ada14')\n\n"} {"input": "IndentationError: unindent does not match any outer indentation level When I compile the Python code below, I get \n\nIndentationError: unindent does not match any outer indentation level\n\n\nimport sys\n\ndef Factorial(n): # Return factorial\n result = 0\n for i in range (1,n):\n result = result * i\n print \"factorial is \",result\n return result\n\nWhy?\n", "output": "EDIT: Other posters are probably correct...there might be spaces mixed in with your tabs. Try doing a search&replace to replace all tabs with a few spaces.\nTry this:\nimport sys\n\ndef Factorial(n): # return factorial\n result = 1\n for i in range (1,n):\n result = result * i\n print \"factorial is \",result\n return result\n\nprint Factorial(10)\n\n"} {"input": "A Transpose/Unzip Function in Python (inverse of zip) I have a list of 2-item tuples and I'd like to convert them to 2 lists where the first contains the first item in each tuple and the second list holds the second item.\nFor example:\noriginal = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]\n# and I want to become...\nresult = (['a', 'b', 'c', 'd'], [1, 2, 3, 4])\n\nIs there a builtin function that does that?\n", "output": "zip is its own inverse! Provided you use the special * operator.\n>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])\n[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]\n\nThe way this works is by calling zip with the arguments:\nzip(('a', 1), ('b', 2), ('c', 3), ('d', 4))\n\n... except the arguments are passed to zip directly (after being converted to a tuple), so there's no need to worry about the number of arguments getting too big.\n"} {"input": "What is the most \"pythonic\" way to iterate over a list in chunks? 
I have a Python script which takes as input a list of integers, which I need to work with four integers at a time. Unfortunately, I don't have control of the input, or I'd have it passed in as a list of four-element tuples. Currently, I'm iterating over it this way:\nfor i in xrange(0, len(ints), 4):\n # dummy op for example code\n foo += ints[i] * ints[i + 1] + ints[i + 2] * ints[i + 3]\n\nIt looks a lot like \"C-think\", though, which makes me suspect there's a more pythonic way of dealing with this situation. The list is discarded after iterating, so it needn't be preserved. Perhaps something like this would be better?\nwhile ints:\n foo += ints[0] * ints[1] + ints[2] * ints[3]\n ints[0:4] = []\n\nStill doesn't quite \"feel\" right, though. :-/\nRelated question: How do you split a list into evenly sized chunks in Python?\n", "output": "def chunker(seq, size):\n return (seq[pos:pos + size] for pos in xrange(0, len(seq), size))\n\nSimple. Easy. Fast. Works with any sequence:\ntext = \"I am a very, very helpful text\"\n\nfor group in chunker(text, 7):\n print repr(group),\n# 'I am a ' 'very, v' 'ery hel' 'pful te' 'xt'\n\nprint '|'.join(chunker(text, 10))\n# I am a ver|y, very he|lpful text\n\nanimals = ['cat', 'dog', 'rabbit', 'duck', 'bird', 'cow', 'gnu', 'fish']\n\nfor group in chunker(animals, 3):\n print group\n# ['cat', 'dog', 'rabbit']\n# ['duck', 'bird', 'cow']\n# ['gnu', 'fish']\n\n"} {"input": "What exactly do \"u\" and \"r\" string flags do in Python, and what are raw string literals? While asking this question, I realized I didn't know much about raw strings. For somebody claiming to be a Django trainer, this sucks.\nI know what an encoding is, and I know what u alone does since I get what is Unicode.\nBut what does r do exactly? 
What kind of string does it result in?\nAnd above all, what the heck does ur do?\nFinally, is there any reliable way to go back from a Unicode string to a simple raw string?\nAh, and by the way, if your system and your text editor charset are set to UTF-8, does u actually do anything?\n", "output": "There's not really any \"raw string\"; there are raw string literals, which are exactly the string literals marked by a 'r' before the opening quote.\nA \"raw string literal\" is a slightly different syntax for a string literal, in which a backslash, \\, is taken as meaning \"just a backslash\" (except when it comes right before a quote that would otherwise terminate the literal) -- no \"escape sequences\" to represent newlines, tabs, backspaces, form-feeds, and so on. In normal string literals, each backslash must be doubled up to avoid being taken as the start of an escape sequence.\nThis syntax variant exists mostly because the syntax of regular expression patterns is heavy with backslashes (but never at the end, so the \"except\" clause above doesn't matter) and it looks a bit better when you avoid doubling up each of them -- that's all. It also gained some popularity to express native Windows file paths (with backslashes instead of regular slashes like on other platforms), but that's very rarely needed (since normal slashes mostly work fine on Windows too) and imperfect (due to the \"except\" clause above).\nr'...' is a byte string (in Python 2.*), ur'...' is a Unicode string (again, in Python 2.*), and any of the other three kinds of quoting also produces exactly the same types of strings (so for example r'...', r'''...''', r\"...\", r\"\"\"...\"\"\" are all byte strings, and so on).\nNot sure what you mean by \"going back\" - there is no intrinsically back and forward directions, because there's no raw string type, it's just an alternative syntax to express perfectly normal string objects, byte or unicode as they may be.\nAnd yes, in Python 2.*, u'...' 
is of course always distinct from just '...' -- the former is a unicode string, the latter is a byte string. What encoding the literal might be expressed in is a completely orthogonal issue.\nE.g., consider (Python 2.6):\n>>> sys.getsizeof('ciao')\n28\n>>> sys.getsizeof(u'ciao')\n34\n\nthe Unicode object of course takes more memory space (very small difference for a very short string, obviously;-).\n"} {"input": "How to save a Python interactive session? I find myself frequently using Python's interpreter to work with databases, files, etc -- basically a lot of manual formatting of semi-structured data. I don't properly save and clean up the useful bits as often as I would like. Is there a way to save my input into the shell (db connections, variable assignments, little for loops and bits of logic) -- some history of the interactive session? If I use something like script I get too much stdout noise. I don't really need to pickle all the objects -- though if there is a solution that does that, it would be OK. Ideally I would just be left with a script that ran as the one I created interactively, and I could just delete the bits I didn't need. Is there a package that does this, or a DIY approach?\nUPDATE: I am really amazed at the quality and usefulness of these packages. For those with a similar itch:\n\nIPython -- should have been using this for ages, kind of what I had in mind\nreinteract -- very impressive, I want to learn more about visualization and this seems like it will shine there. Sort of a gtk/gnome desktop app that renders graphs inline. Imagine a hybrid shell + graphing calculator + mini eclipse. Source distribution here: http://www.reinteract.org/trac/wiki/GettingIt . Built fine on Ubuntu, integrates into gnome desktop, Windows and Mac installers too.\nbpython -- extremely cool, lots of nice features, autocomplete(!), rewind, one keystroke save to file, indentation, well done. 
Python source distribution, pulled a couple of dependencies from sourceforge.\n\nI am converted, these really fill a need between interpreter and editor.\n", "output": "IPython is extremely useful if you like using interactive sessions. For example for your usecase there is the %save magic command, you just input %save my_useful_session 10-20 23 to save input lines 10 to 20 and 23 to my_useful_session.py. (to help with this, every line is prefixed by its number)\nLook at the videos on the documentation page to get a quick overview of the features.\n"} {"input": "Python: about catching ANY exception How can I write a try/except block that catches all exceptions?\n", "output": "Apart from a bare except: clause (which as others have said you shouldn't use), you can simply catch Exception:\nimport traceback\nimport logging\n\ntry:\n whatever()\nexcept Exception as e:\n logging.error(traceback.format_exc())\n # Logs the error appropriately. \n\nYou would normally only ever consider doing this at the outermost level of your code if for example you wanted to handle any otherwise uncaught exceptions before terminating.\nThe advantage of except Exception over the bare except is that there are a few exceptions that it wont catch, most obviously KeyboardInterrupt and SystemExit: if you caught and swallowed those then you could make it hard for anyone to exit your script.\n"} {"input": "Check if a Python list item contains a string inside another string I have a list:\nmy_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']\n\nand want to search for items that contain the string 'abc'. How can I do that?\nif 'abc' in my_list:\n\nwould check if 'abc' exists in the list but it is a part of 'abc-123' and 'abc-456', 'abc' does not exist on its own. 
So how can I get all items that contain 'abc' ?\n", "output": "If you only want to check for the presence of abc in any string in the list, you could try\nsome_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']\nif any(\"abc\" in s for s in some_list):\n # whatever\n\nIf you really want to get all the items containing abc, use\nmatching = [s for s in some_list if \"abc\" in s]\n\n"} {"input": "Elegant Python function to convert CamelCase to snake_case? Example:\n>>> convert('CamelCase')\n'camel_case'\n\n", "output": "This is pretty thorough:\ndef convert(name):\n s1 = re.sub('(.)([A-Z][a-z]+)', r'\\1_\\2', name)\n return re.sub('([a-z0-9])([A-Z])', r'\\1_\\2', s1).lower()\n\nWorks with all these (and doesn't harm already-un-cameled versions):\n>>> convert('CamelCase')\n'camel_case'\n>>> convert('CamelCamelCase')\n'camel_camel_case'\n>>> convert('Camel2Camel2Case')\n'camel2_camel2_case'\n>>> convert('getHTTPResponseCode')\n'get_http_response_code'\n>>> convert('get2HTTPResponseCode')\n'get2_http_response_code'\n>>> convert('HTTPResponseCode')\n'http_response_code'\n>>> convert('HTTPResponseCodeXYZ')\n'http_response_code_xyz'\n\nOr if you're going to call it a zillion times, you can pre-compile the regexes:\nfirst_cap_re = re.compile('(.)([A-Z][a-z]+)')\nall_cap_re = re.compile('([a-z0-9])([A-Z])')\ndef convert(name):\n s1 = first_cap_re.sub(r'\\1_\\2', name)\n return all_cap_re.sub(r'\\1_\\2', s1).lower()\n\nDon't forget to import the regular expression module\nimport re\n\n"} {"input": "Maximum and Minimum values for ints I am looking for minimum and maximum values for integers in python. For eg., in Java, we have Integer.MIN_VALUE and Integer.MAX_VALUE. Is there something like this in python?\n", "output": "In Python 2, there's sys.maxint:\n>>> sys.maxint\n9223372036854775807\n\nAnd you can calculate the minimum value with -sys.maxint - 1 as seen here. 
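As a quick sketch of that calculation (hedged: sys.maxint exists only on Python 2; the fallback to sys.maxsize is an assumption added here so the same snippet also runs on Python 3, where sys.maxsize plays a similar word-size role):

```python
import sys

# sys.maxint is Python 2 only; fall back to sys.maxsize (the platform's
# Py_ssize_t maximum) so this snippet also runs on Python 3.
maxint = getattr(sys, "maxint", sys.maxsize)
minint = -maxint - 1

print(maxint)  # e.g. 9223372036854775807 on a 64-bit build
print(minint)  # e.g. -9223372036854775808 on a 64-bit build
```

The -maxint - 1 form for the minimum comes from two's-complement representation: a C long's most negative value is one further from zero than its most positive value.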
Of course Python just switches from plain to long integers once you exceed this value.\nIn Python 3 this value has no significance at all for integers. (However, a closely related value is still available as sys.maxsize, in case you're actually looking for the machine's word size.)\n"} {"input": "Is it worth using Python's re.compile? Is there any benefit in using compile for regular expressions in Python?\nh = re.compile('hello')\nh.match('hello world')\n\nvs\nre.match('hello', 'hello world')\n\n", "output": "I've had a lot of experience running a compiled regex 1000s of times versus compiling on-the-fly, and have not noticed any perceivable difference. Obviously, this is anecdotal, and certainly not a great argument against compiling, but I've found the difference to be negligible.\nEDIT:\nAfter a quick glance at the actual Python 2.5 library code, I see that Python internally compiles AND CACHES regexes whenever you use them anyway (including calls to re.match()), so you're really only changing WHEN the regex gets compiled, and shouldn't be saving much time at all - only the time it takes to check the cache (a key lookup on an internal dict type).\nFrom module re.py (comments are mine):\ndef match(pattern, string, flags=0):\n return _compile(pattern, flags).match(string)\n\ndef _compile(*key):\n\n # Does cache check at top of function\n cachekey = (type(key[0]),) + key\n p = _cache.get(cachekey)\n if p is not None: return p\n\n # ...\n # Does actual compilation on cache miss\n # ...\n\n # Caches compiled regex\n if len(_cache) >= _MAXCACHE:\n _cache.clear()\n _cache[cachekey] = p\n return p\n\nI still often pre-compile regular expressions, but only to bind them to a nice, reusable name, not for any expected performance gain.\n"} {"input": "Why is 'x' in ('x',) faster than 'x' == 'x'? 
>>> timeit.timeit(\"'x' in ('x',)\")\n0.04869917374131205\n>>> timeit.timeit(\"'x' == 'x'\")\n0.06144205736110564\n\nAlso works for tuples with multiple elements, both versions seem to grow linearly:\n>>> timeit.timeit(\"'x' in ('x', 'y')\")\n0.04866674801541748\n>>> timeit.timeit(\"'x' == 'x' or 'x' == 'y'\")\n0.06565782838087131\n>>> timeit.timeit(\"'x' in ('y', 'x')\")\n0.08975995576448526\n>>> timeit.timeit(\"'x' == 'y' or 'x' == 'y'\")\n0.12992391047427532\n\nBased on this, I think I should totally start using in everywhere instead of ==!\n", "output": "As I mentioned to David Wolever, there's more to this than meets the eye; both methods dispatch to is; you can prove this by doing\nmin(Timer(\"x == x\", setup=\"x = 'a' * 1000000\").repeat(10, 10000))\n#>>> 0.00045456900261342525\n\nmin(Timer(\"x == y\", setup=\"x = 'a' * 1000000; y = 'a' * 1000000\").repeat(10, 10000))\n#>>> 0.5256857610074803\n\nThe first can only be so fast because it checks by identity.\nTo find out why one would take longer than the other, let's trace through execution.\nThey both start in ceval.c, from COMPARE_OP since that is the bytecode involved\nTARGET(COMPARE_OP) {\n PyObject *right = POP();\n PyObject *left = TOP();\n PyObject *res = cmp_outcome(oparg, left, right);\n Py_DECREF(left);\n Py_DECREF(right);\n SET_TOP(res);\n if (res == NULL)\n goto error;\n PREDICT(POP_JUMP_IF_FALSE);\n PREDICT(POP_JUMP_IF_TRUE);\n DISPATCH();\n}\n\nThis pops the values from the stack (technically it only pops one)\nPyObject *right = POP();\nPyObject *left = TOP();\n\nand runs the compare:\nPyObject *res = cmp_outcome(oparg, left, right);\n\ncmp_outcome is this:\nstatic PyObject *\ncmp_outcome(int op, PyObject *v, PyObject *w)\n{\n int res = 0;\n switch (op) {\n case PyCmp_IS: ...\n case PyCmp_IS_NOT: ...\n case PyCmp_IN:\n res = PySequence_Contains(w, v);\n if (res < 0)\n return NULL;\n break;\n case PyCmp_NOT_IN: ...\n case PyCmp_EXC_MATCH: ...\n default:\n return PyObject_RichCompare(v, w, op);\n 
}\n v = res ? Py_True : Py_False;\n Py_INCREF(v);\n return v;\n}\n\nThis is where the paths split. The PyCmp_IN branch does\nint\nPySequence_Contains(PyObject *seq, PyObject *ob)\n{\n Py_ssize_t result;\n PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;\n if (sqm != NULL && sqm->sq_contains != NULL)\n return (*sqm->sq_contains)(seq, ob);\n result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);\n return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);\n}\n\nNote that a tuple is defined as\nstatic PySequenceMethods tuple_as_sequence = {\n ...\n (objobjproc)tuplecontains, /* sq_contains */\n};\n\nPyTypeObject PyTuple_Type = {\n ...\n &tuple_as_sequence, /* tp_as_sequence */\n ...\n};\n\nSo the branch\nif (sqm != NULL && sqm->sq_contains != NULL)\n\nwill be taken and *sqm->sq_contains, which is the function (objobjproc)tuplecontains, will be taken.\nThis does\nstatic int\ntuplecontains(PyTupleObject *a, PyObject *el)\n{\n Py_ssize_t i;\n int cmp;\n\n for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)\n cmp = PyObject_RichCompareBool(el, PyTuple_GET_ITEM(a, i),\n Py_EQ);\n return cmp;\n}\n\n...Wait, wasn't that PyObject_RichCompareBool what the other branch took? Nope, that was PyObject_RichCompare.\nThat code path was short so it likely just comes down to the speed of these two. Let's compare.\nint\nPyObject_RichCompareBool(PyObject *v, PyObject *w, int op)\n{\n PyObject *res;\n int ok;\n\n /* Quick result when objects are the same.\n Guarantees that identity implies equality. */\n if (v == w) {\n if (op == Py_EQ)\n return 1;\n else if (op == Py_NE)\n return 0;\n }\n\n ...\n}\n\nThe code path in PyObject_RichCompareBool pretty much immediately terminates. For PyObject_RichCompare, it does\nPyObject *\nPyObject_RichCompare(PyObject *v, PyObject *w, int op)\n{\n PyObject *res;\n\n assert(Py_LT <= op && op <= Py_GE);\n if (v == NULL || w == NULL) { ... 
}\n if (Py_EnterRecursiveCall(\" in comparison\"))\n return NULL;\n res = do_richcompare(v, w, op);\n Py_LeaveRecursiveCall();\n return res;\n}\n\nThe Py_EnterRecursiveCall/Py_LeaveRecursiveCall combo are not taken in the previous path, but these are relatively quick macros that'll short-circuit after incrementing and decrementing some globals.\ndo_richcompare does:\nstatic PyObject *\ndo_richcompare(PyObject *v, PyObject *w, int op)\n{\n richcmpfunc f;\n PyObject *res;\n int checked_reverse_op = 0;\n\n if (v->ob_type != w->ob_type && ...) { ... }\n if ((f = v->ob_type->tp_richcompare) != NULL) {\n res = (*f)(v, w, op);\n if (res != Py_NotImplemented)\n return res;\n ...\n }\n ...\n}\n\nThis does some quick checks to call v->ob_type->tp_richcompare which is\nPyTypeObject PyUnicode_Type = {\n ...\n PyUnicode_RichCompare, /* tp_richcompare */\n ...\n};\n\nwhich does\nPyObject *\nPyUnicode_RichCompare(PyObject *left, PyObject *right, int op)\n{\n int result;\n PyObject *v;\n\n if (!PyUnicode_Check(left) || !PyUnicode_Check(right))\n Py_RETURN_NOTIMPLEMENTED;\n\n if (PyUnicode_READY(left) == -1 ||\n PyUnicode_READY(right) == -1)\n return NULL;\n\n if (left == right) {\n switch (op) {\n case Py_EQ:\n case Py_LE:\n case Py_GE:\n /* a string is equal to itself */\n v = Py_True;\n break;\n case Py_NE:\n case Py_LT:\n case Py_GT:\n v = Py_False;\n break;\n default:\n ...\n }\n }\n else if (...) { ... }\n else { ...}\n Py_INCREF(v);\n return v;\n}\n\nNamely, this shortcuts on left == right... 
but only after doing \n if (!PyUnicode_Check(left) || !PyUnicode_Check(right))\n\n if (PyUnicode_READY(left) == -1 ||\n PyUnicode_READY(right) == -1)\n\nAll in all the paths then look something like this (manually recursively inlining, unrolling and pruning known branches)\nPOP() # Stack stuff\nTOP() #\n #\ncase PyCmp_IN: # Dispatch on operation\n #\nsqm != NULL # Dispatch to builtin op\nsqm->sq_contains != NULL #\n*sqm->sq_contains #\n #\ncmp == 0 # Do comparison in loop\ni < Py_SIZE(a) #\nv == w #\nop == Py_EQ #\n++i # \ncmp == 0 #\n #\nres < 0 # Convert to Python-space\nres ? Py_True : Py_False #\nPy_INCREF(v) #\n #\nPy_DECREF(left) # Stack stuff\nPy_DECREF(right) #\nSET_TOP(res) #\nres == NULL #\nDISPATCH() #\n\nvs\nPOP() # Stack stuff\nTOP() #\n #\ndefault: # Dispatch on operation\n #\nPy_LT <= op # Checking operation\nop <= Py_GE #\nv == NULL #\nw == NULL #\nPy_EnterRecursiveCall(...) # Recursive check\n #\nv->ob_type != w->ob_type # More operation checks\nf = v->ob_type->tp_richcompare # Dispatch to builtin op\nf != NULL #\n #\n!PyUnicode_Check(left) # ...More checks\n!PyUnicode_Check(right)) #\nPyUnicode_READY(left) == -1 #\nPyUnicode_READY(right) == -1 #\nleft == right # Finally, doing comparison\ncase Py_EQ: # Immediately short circuit\nPy_INCREF(v); #\n #\nres != Py_NotImplemented #\n #\nPy_LeaveRecursiveCall() # Recursive check\n #\nPy_DECREF(left) # Stack stuff\nPy_DECREF(right) #\nSET_TOP(res) #\nres == NULL #\nDISPATCH() #\n\nNow, PyUnicode_Check and PyUnicode_READY are pretty cheap since they only check a couple of fields, but it should be obvious that the top one is a smaller code path, it has fewer function calls, only one switch\nstatement and is just a bit thinner.\nTL;DR:\nBoth dispatch to if (left_pointer == right_pointer); the difference is just how much work they do to get there. in just does less.\n"} {"input": "How do I run Python code from Sublime Text 2? 
I want to set up a complete Python IDE in Sublime Text 2.\nI want to know how to run the Python code from within the editor. Is it done using build system? How do I do it ?\n", "output": "Tools -> Build System -> (choose) Python then:\nTo Run:\n Tools -> Build\n\n -or-\n\n Ctrl + B\n\n CMD + B (OSX)\n\nThis would start your file in the console which should be at the bottom of the editor.\nTo Stop:\n Ctrl + Break or Tools -> Cancel Build\n\nYou can find out where your Break key is here: http://en.wikipedia.org/wiki/Break_key. \nNote: CTRL + C will NOT work.\nWhat to do when Ctrl + Break does not work:\nGo to: \n\nPreferences -> Key Bindings - User\n\nand paste the line below: \n{\"keys\": [\"ctrl+shift+c\"], \"command\": \"exec\", \"args\": {\"kill\": true} } \n\nNow, you can use ctrl+shift+c instead of CTRL+BREAK\n"} {"input": "How to print a class or objects of class using print()? I am learning the ropes in Python. When I try to print an object of class Foobar using the print() function, I get an output like this:\n<__main__.Foobar instance at 0x7ff2a18c>\n\nIs there a way I can set the printing behaviour (or the string representation) of a class and its objects? For instance, when I call print() on a class object, I would like to print its data members in a certain format. How to achieve this in Python?\nIf you are familiar with C++ classes, the above can be achieved for the standard ostream by adding a friend ostream& operator << (ostream&, const Foobar&) method for the class.\n", "output": ">>> class Test:\n... def __repr__(self):\n... return \"Test()\"\n... def __str__(self):\n... return \"member of Test\"\n... \n>>> t = Test()\n>>> t\nTest()\n>>> print t\nmember of Test\n\nThe __str__ method is what happens when you print it, and the __repr__ method is what happens when you use the repr() function (or when you look at it with the interactive prompt). 
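For the data-member formatting asked about in the question, here's a minimal sketch; the Foobar fields (name, value) are invented for illustration, since the question doesn't show the real ones:

```python
class Foobar:
    def __init__(self, name, value):
        # Illustrative data members; the real Foobar's fields aren't shown in the question.
        self.name = name
        self.value = value

    def __repr__(self):
        # Unambiguous, developer-facing representation (shown at the interactive prompt).
        return "Foobar(name=%r, value=%r)" % (self.name, self.value)

    def __str__(self):
        # Human-readable form used by print().
        return "Foobar '%s' with value %s" % (self.name, self.value)

f = Foobar("spam", 42)
print(str(f))   # Foobar 'spam' with value 42
print(repr(f))  # Foobar(name='spam', value=42)
```

Defining both methods gives you a friendly print() output while keeping a precise form for debugging.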
If this isn't the most Pythonic method, I apologize, because I'm still learning too - but it works.\nIf no __str__ method is given, Python will print the result of __repr__ instead. If you define __str__ but not __repr__, Python will use what you see above as the __repr__, but still use __str__ for printing.\n"} {"input": "Convert a list of characters into a string If I have a list of chars:\na = ['a','b','c','d']\n\nHow do I convert it into a single string?\na = 'abcd'\n\n", "output": "Use the join method of the empty string to join all of the strings together with the empty string in between, like so:\n>>> a = ['a', 'b', 'c', 'd']\n>>> ''.join(a)\n'abcd'\n\n"} {"input": "What's the idiomatic syntax for prepending to a short python list? list.append() is the obvious choice for adding to the end of a list. Here's a reasonable explanation for the missing list.prepend(). Assuming my list is short and performance concerns are negligible, is \nlist.insert(0, x)\n\nor \nlist[0:0] = [x]\n\nidiomatic?\n", "output": "The s.insert(0, x) form is the most common.\nWhenever you see it though, it may be time to consider using a collections.deque instead of a list.\n"} {"input": "Which exception should I raise on bad/illegal argument combinations in Python? I was wondering about the best practices for indicating invalid argument combinations in Python. I've come across a few situations where you have a function like so:\ndef import_to_orm(name, save=False, recurse=False):\n \"\"\"\n :param name: Name of some external entity to import.\n :param save: Save the ORM object before returning.\n :param recurse: Attempt to import associated objects as well. Because you\n need the original object to have a key to relate to, save must be\n `True` for recurse to be `True`.\n :raise BadValueError: If `recurse and not save`.\n :return: The ORM object.\n \"\"\"\n pass\n\nThe only annoyance with this is that every package has its own, usually slightly differing BadValueError. 
I know that in Java there exists java.lang.IllegalArgumentException -- is it well understood that everybody will be creating their own BadValueErrors in Python or is there another, preferred method?\n", "output": "I would just raise ValueError, unless you need a more specific exception.\ndef import_to_orm(name, save=False, recurse=False):\n if recurse and not save:\n raise ValueError(\"save must be True if recurse is True\")\n\nThere's really no point in doing class BadValueError(ValueError):pass - your custom class is identical in use to ValueError, so why not use that?\n"} {"input": "Python urlencode querystring I am trying to urlencode this string before I submit. \nqueryString = 'eventName=' + evt.fields[\"eventName\"] + '&' + 'eventDescription=' + evt.fields[\"eventDescription\"]; \n\n", "output": "Python 2\nWhat you're looking for is urllib.quote_plus:\n>>> urllib.quote_plus('string_of_characters_like_these:$#@=?%^Q^$')\n'string_of_characters_like_these%3A%24%23%40%3D%3F%25%5EQ%5E%24'\n\nPython 3\nIn Python 3, the urllib package has been broken into smaller components. You'll use urllib.parse.quote_plus() (note the parse child module).\n"} {"input": "How to check Django version I have to use Python and Django for our application. So I have two versions of Python, 2.6 and 2.7. Now I have installed Django. I could run the sample application for testing Django successfully. But how do I make sure whether Django uses the 2.6 or 2.7 version and what version of modules Django uses?\n", "output": "Django 1.5 supports Python 2.6.5 and later.\nIf you're under Linux and want to check the Python version you're using, run python -V from the command line.\nIf you want to check the Django version, open a Python console and type\nimport django\ndjango.VERSION\n\n"} {"input": "Fastest way to check if a value exists in a list I'm searching for the fastest way to know if a value exists in a list (a list with millions of values in it) and what its index is? 
I know all the values in the list are unique, as in my example.\nThe first method I tried is (3.8 sec in my real code):\na = [4,2,3,1,5,6]\n\nif a.count(7) == 1:\n b=a.index(7)\n \"Do something with variable b\"\n\nThe second method I tried is (2x faster: 1.9 sec in my real code):\na = [4,2,3,1,5,6]\n\ntry:\n b=a.index(7)\nexcept ValueError:\n \"Do nothing\"\nelse:\n \"Do something with variable b\"\n\nA method proposed by an S.O. user (2.74 sec in my real code):\na = [4,2,3,1,5,6]\nif 7 in a:\n a.index(7)\n\nIn my real code, the first method takes 3.81 sec and the second method takes 1.88 sec.\nIt's a good improvement, but:\nI'm a beginner with Python/scripting and I want to know if a faster way exists to do the same thing and save more processing time?\nA more specific explanation for my application:\nIn the API of Blender I can access a list of particles:\nparticles = [1,2,3,4...etc.]\n\nFrom there, I can access a particle's location:\nparticles[x].location = [x,y,z]\n\nThen, for each particle, I test whether a neighbour exists by searching in the locations of all particles, like:\nif [x+1,y,z] in particles.location:\n \"find the identity of this neighbour particle, i.e. the index\n of the particles array\"\n particles.index([x+1,y,z])\n\n", "output": "7 in a\n\nClearest and fastest way to do it.\nYou can also consider using a set, but constructing that set from your list may take more time than faster membership testing will save. The only way to be certain is to benchmark well. (This also depends on what operations you require.)\n"} {"input": "How do I get my python program to sleep for 50 milliseconds? How do I get my python program to sleep for 50 milliseconds?\n", "output": "Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)\n[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n>>> from time import sleep\n>>> sleep(0.05)\n>>>\n\n"} {"input": "What's the difference between eval, exec, and compile in Python? 
I've been looking at dynamic evaluation of Python code, and come across the eval() and compile() functions, and the exec statement.\nCan someone please explain the difference between eval and exec, and how the different modes of compile() fit in?\n", "output": "\nexec is not an expression: a statement in Python 2.x, and a function in Python 3.x. It compiles and immediately evaluates a statement or set of statements contained in a string. Example:\nexec('print(5)') # prints 5.\n# exec 'print 5' if you use Python 2.x; neither exec nor print is a function there\nexec('print(5)\\nprint(6)') # prints 5{newline}6.\nexec('if True: print(6)') # prints 6.\nexec('5') # does nothing and returns nothing.\n\neval is a built-in function (not a statement), which evaluates an expression and returns the value that expression produces. Example:\nx = eval('5') # x <- 5\nx = eval('%d + 6' % x) # x <- 11\nx = eval('abs(%d)' % -100) # x <- 100\nx = eval('x = 5') # INVALID; assignment is not an expression.\nx = eval('if 1: x = 4') # INVALID; if is a statement, not an expression.\n\ncompile is a lower-level version of exec and eval. It does not execute or evaluate your statements or expressions, but returns a code object that can do it. The modes are as follows:\n\ncompile(string, '', 'eval') returns the code object that would have been executed had you done eval(string). Note that you cannot use statements in this mode; only a (single) expression is valid.\ncompile(string, '', 'exec') returns the code object that would have been executed had you done exec(string). You can use any number of statements here.\ncompile(string, '', 'single') is like the exec mode, but it will ignore everything except for the first statement. Note that an if/else statement with its results is considered a single statement.\n\n\n"} {"input": "How to find out the number of CPUs using python I want to know the number of CPUs on the local machine using Python. 
The result should be user/real as output by time(1) when called with an optimally scaling userspace-only program.\n", "output": "If you have python with a version >= 2.6 you can simply use\nimport multiprocessing\n\nmultiprocessing.cpu_count()\n\nhttp://docs.python.org/library/multiprocessing.html#multiprocessing.cpu_count\n"} {"input": "What is the best way to remove accents in a Python unicode string? I have a Unicode string in Python, and I would like to remove all the accents (diacritics).\nI found on the Web an elegant way to do this in Java:\n\nconvert the Unicode string to its long normalized form (with a separate character for letters and diacritics)\nremove all the characters whose Unicode type is \"diacritic\".\n\nDo I need to install a library such as pyICU or is this possible with just the python standard library? And what about python 3?\nImportant note: I would like to avoid code with an explicit mapping from accented characters to their non-accented counterpart.\n", "output": "How about this:\nimport unicodedata\ndef strip_accents(s):\n return ''.join(c for c in unicodedata.normalize('NFD', s)\n if unicodedata.category(c) != 'Mn')\n\nThis works on greek letters, too:\n>>> strip_accents(u\"A \\u00c0 \\u0394 \\u038E\")\nu'A A \\u0394 \\u03a5'\n>>> \n\nThe character category \"Mn\" stands for Nonspacing_Mark, which is similar to unicodedata.combining in MiniQuark's answer (I didn't think of unicodedata.combining, but it is probably the better solution, because it's more explicit).\nAnd keep in mind, these manipulations may significantly alter the meaning of the text. Accents, Umlauts etc. are not \"decoration\".\n"} {"input": "Simple Digit Recognition OCR in OpenCV-Python I am trying to implement a \"Digit Recognition OCR\" in OpenCV-Python (cv2). It is just for learning purposes. I would like to learn both KNearest and SVM features in OpenCV. \nI have 100 samples (i.e. images) of each digit. 
I would like to train with them.\nThere is a sample letter_recog.py that comes with OpenCV sample. But i still couldn't figure out how to use it. I don't understand what are the samples, responses etc. Also, it loads a txt file at first, which i didn't understand first.\nLater on searching a little bit, i could find a letter_recognition.data in cpp samples. I used it and made a code for cv2.KNearest in the model of letter_recog.py (just for testing):\nimport numpy as np\nimport cv2\n\nfn = 'letter-recognition.data'\na = np.loadtxt(fn, np.float32, delimiter=',', converters={ 0 : lambda ch : ord(ch)-ord('A') })\nsamples, responses = a[:,1:], a[:,0]\n\nmodel = cv2.KNearest()\nretval = model.train(samples,responses)\nretval, results, neigh_resp, dists = model.find_nearest(samples, k = 10)\nprint results.ravel()\n\nIt gave me an array of size 20000, i don't understand what it is.\nQuestions:\n1) What is letter_recognition.data file? How to build that file from my own data set?\n2) What does results.ravel() denote? \n3) How we can write a simple digit recognition tool using letter_recognition.data file (either KNearest or SVM)?\n", "output": "Well, I decided to work out my question myself to solve the above problem. What i wanted was to implement a simple OCR using KNearest or SVM features in OpenCV. And below is what i did and how. ( it is just for learning how to use KNearest for simple OCR purposes).\n1) My first question was about the letter_recognition.data file that comes with OpenCV samples. I wanted to know what is inside that file.\nIt contains a letter, along with 16 features of that letter.\nAnd this SOF helped me to find it. These 16 features are explained in the paper Letter Recognition Using Holland-Style Adaptive Classifiers.\n( Although i didn't understand some of the features at the end)\n2) Since i knew, without understanding all those features, it is difficult to do that method. 
i tried some other papers, but all were a little difficult for a beginner.\nSo I just decided to take all the pixel values as my features. (I was not worried about accuracy or performance, i just wanted it to work, at least with the least accuracy)\nI took below image for my training data:\n\n( I know the amount of training data is less. But, since all letters are of same font and size, i decided to try on this).\nTo prepare the data for training, i made a small code in OpenCV. It does following things:\na) It loads the image.\nb) Selects the digits ( obviously by contour finding and applying constraints on area and height of letters to avoid false detections).\nc) Draws the bounding rectangle around one letter and wait for key press manually. This time we press the digit key ourselves corresponding to the letter in box.\nd) Once corresponding digit key is pressed, it resizes this box to 10x10 and saves 100 pixel values in an array (here, samples) and corresponding manually entered digit in another array(here, responses).\ne) Then save both the arrays in separate txt files.\nAt the end of manual classification of digits, all the digits in the train data( train.png) are labeled manually by ourselves, image will look like below:\n\nBelow is the code i used for above purpose ( of course, not so clean):\nimport sys\n\nimport numpy as np\nimport cv2\n\nim = cv2.imread('pitrain.png')\nim3 = im.copy()\n\ngray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)\nblur = cv2.GaussianBlur(gray,(5,5),0)\nthresh = cv2.adaptiveThreshold(blur,255,1,1,11,2)\n\n################# Now finding Contours ###################\n\ncontours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)\n\nsamples = np.empty((0,100))\nresponses = []\nkeys = [i for i in range(48,58)]\n\nfor cnt in contours:\n if cv2.contourArea(cnt)>50:\n [x,y,w,h] = cv2.boundingRect(cnt)\n\n if h>28:\n cv2.rectangle(im,(x,y),(x+w,y+h),(0,0,255),2)\n roi = thresh[y:y+h,x:x+w]\n roismall = 
cv2.resize(roi,(10,10))\n cv2.imshow('norm',im)\n key = cv2.waitKey(0)\n\n if key == 27: # (escape to quit)\n sys.exit()\n elif key in keys:\n responses.append(int(chr(key)))\n sample = roismall.reshape((1,100))\n samples = np.append(samples,sample,0)\n\nresponses = np.array(responses,np.float32)\nresponses = responses.reshape((responses.size,1))\nprint \"training complete\"\n\nnp.savetxt('generalsamples.data',samples)\nnp.savetxt('generalresponses.data',responses)\n\n\nNow we enter in to training and testing part.\nFor testing part i used below image, which has same type of letters i used to train.\n\nFor training we do as follows:\na) Load the txt files we already saved earlier\nb) create a instance of classifier we are using ( here, it is KNearest)\nc) Then we use KNearest.train function to train the data\nFor testing purposes, we do as follows:\na) We load the image used for testing\nb) process the image as earlier and extract each digit using contour methods\nc) Draw bounding box for it, then resize to 10x10, and store its pixel values in an array as done earlier. \nd) Then we use KNearest.find_nearest() function to find the nearest item to the one we gave. 
( If lucky, it recognises the correct digit.)\nI included last two steps ( training and testing) in single code below:\nimport cv2\nimport numpy as np\n\n####### training part ############### \nsamples = np.loadtxt('generalsamples.data',np.float32)\nresponses = np.loadtxt('generalresponses.data',np.float32)\nresponses = responses.reshape((responses.size,1))\n\nmodel = cv2.KNearest()\nmodel.train(samples,responses)\n\n############################# testing part #########################\n\nim = cv2.imread('pi.png')\nout = np.zeros(im.shape,np.uint8)\ngray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)\nthresh = cv2.adaptiveThreshold(gray,255,1,1,11,2)\n\ncontours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)\n\nfor cnt in contours:\n if cv2.contourArea(cnt)>50:\n [x,y,w,h] = cv2.boundingRect(cnt)\n if h>28:\n cv2.rectangle(im,(x,y),(x+w,y+h),(0,255,0),2)\n roi = thresh[y:y+h,x:x+w]\n roismall = cv2.resize(roi,(10,10))\n roismall = roismall.reshape((1,100))\n roismall = np.float32(roismall)\n retval, results, neigh_resp, dists = model.find_nearest(roismall, k = 1)\n string = str(int((results[0][0])))\n cv2.putText(out,string,(x,y+h),0,1,(0,255,0))\n\ncv2.imshow('im',im)\ncv2.imshow('out',out)\ncv2.waitKey(0)\n\nAnd it worked , below is the result i got:\n\n\nHere it worked with 100% accuracy, for which the reason, i assume, is all digits are of same kind and same size.\nBut any way, this is a good start to go for beginners ( i hope so).\n"} {"input": "How do I make python to wait for a pressed key I want my script to wait until the user presses any key.\nHow do I do that?\n", "output": "One way to do this in Python 2, is to use raw_input():\nraw_input(\"Press Enter to continue...\")\n\nIn python3 it's just input()\n"} {"input": "Find all packages installed with easy_install/pip? Is there a way to find all Python PyPI packages that were installed with easy_install or pip? 
I mean, excluding everything that was/is installed with the distribution's tools (in this case apt-get on Debian).\n", "output": "pip freeze will output a list of installed packages and their versions. It also allows you to write those packages to a file that can later be used to set up a new environment.\nhttp://www.pip-installer.org/en/latest/index.html#freezing-requirements\n"} {"input": "Short Description of Scoping Rules What exactly are the Python scoping rules?\nIf I have some code:\ncode1\nclass Foo:\n code2\n def spam.....\n code3\n for code4..:\n code5\n x()\n\nWhere is x found? Some possible choices include the list below:\n\nIn the enclosing source file\nIn the class namespace\nIn the function definition\nIn the for loop index variable\nInside the for loop\n\nAlso there is the context during execution, when the function spam is passed somewhere else. And maybe lambda functions pass a bit differently?\nThere must be a simple reference or algorithm somewhere. It's a confusing world for intermediate Python programmers.\n", "output": "Actually, a concise rule for Python scope resolution, from Learning Python, 3rd Ed. (These rules are specific to variable names, not attributes. If you reference it without a period, these rules apply.)\nLEGB Rule.\nL, Local \u2014 Names assigned in any way within a function (def or lambda), and not declared global in that function.\nE, Enclosing function locals \u2014 Names in the local scope of any and all enclosing functions (def or lambda), from inner to outer.\nG, Global (module) \u2014 Names assigned at the top level of a module file, or declared global in a def within the file.\nB, Built-in (Python) \u2014 Names preassigned in the built-in names module: open, range, SyntaxError, ...\nSo, in the case of\ncode1\nclass Foo:\n code2\n def spam.....\n code3\n for code4..:\n code5\n x()\n\nThe for loop does not have its own namespace. 
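A rough sketch of that lookup order in action (the variable names here are invented for illustration):

```python
x = "global x"            # G: bound at module level

def outer():
    x = "enclosing x"     # E: local to the enclosing function
    def inner():
        x = "local x"     # L: the innermost binding wins
        return x
    return inner()

def no_local():
    # No L or E binding for x here, so lookup falls through to G.
    return x

print(outer())      # local x
print(no_local())   # global x
print(abs)          # B: 'abs' is found among the built-ins
```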
In LEGB order, the scopes would be \nL : local, in the current def.\nE : Enclosed function, any enclosing functions (if def spam was in another def)\nG : Global. Were there any declared globally in the module?\nB : Any builtin x() in Python.\n"} {"input": "Does Python have an ordered set? Python has an ordered dictionary, what about an ordered set?\n", "output": "There is an ordered set recipe for this which is referred to from the Python 2 Documentation. This runs on Py2.6 or later and 3.0 or later without any modifications. The interface is almost exactly the same as a normal set, except that initialisation should be done with a list.\nOrderedSet([1, 2, 3])\n\nThis is a MutableSet, so the signature for .union doesn't match that of set, but since it includes __or__ something similar can easily be added:\n@staticmethod\ndef union(*sets):\n union = OrderedSet()\n union.union(*sets)\n return union\n\ndef union(self, *sets):\n for set in sets:\n self |= set\n\n"} {"input": "time.sleep -- sleeps thread or process? In Python for the *nix, does time.sleep() block the thread or the process?\n", "output": "It blocks the thread. If you look in Modules/timemodule.c in the Python source, you'll see that in the call to floatsleep(), the substantive part of the sleep operation is wrapped in a Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS block, allowing other threads to continue to execute while the current one sleeps. 
You can also test this with a simple python program:\nimport time\nfrom threading import Thread\n\nclass worker(Thread):\n def run(self):\n \tfor x in xrange(0,11):\n \t\tprint x\n \t\ttime.sleep(1)\n\nclass waiter(Thread):\n def run(self):\n \tfor x in xrange(100,103):\n \t\tprint x\n \t\ttime.sleep(5)\n\ndef run():\n worker().start()\n waiter().start()\n\nWhich will print:\n>>> thread_test.run()\n0\n100\n>>> 1\n2\n3\n4\n5\n101\n6\n7\n8\n9\n10\n102\n\n"} {"input": "Python list of lists, changes reflected across sublists unexpectedly I needed to create a list of lists in Python, so I typed the following: \nmyList = [[1] * 4] * 3\n\nThe list looked like this: \n[[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]] \n\nThen I changed one of the innermost values: \nmyList[0][0] = 5\n\nNow my list looks like this: \n[[5, 1, 1, 1], [5, 1, 1, 1], [5, 1, 1, 1]] \n\nwhich is not what I wanted or expected. Can someone please explain what's going on, and how to get around it?\n", "output": "When you write [x]*3 you get, essentially, the list [x, x, x]. That is, a list with 3 references to the same x. When you then modify this single x it is visible via all three references to it.\nTo fix it, you need to make sure that you create a new list at each position. One way to do it is\n[[1]*4 for n in range(3)]\n\n"} {"input": "How to break out of multiple loops in Python? Given the following code (that doesn't work):\nwhile True:\n #snip: print out current state\n while True:\n ok = get_input(\"Is this ok? (y/n)\")\n if ok == \"y\" or ok == \"Y\": break 2 #this doesn't work :(\n if ok == \"n\" or ok == \"N\": break\n #do more processing with menus and stuff\n\nIs there a way to make this work? 
Or do I have to do one check to break out of the input loop, then another, more limited, check in the outside loop to break out altogether if the user is satisfied?\nEdit-FYI: get_input is a short function I wrote that supports showing a prompt and default values and all that fanciness and returns stdin.readline().strip()\n", "output": "My first instinct would be to refactor the nested loop into a function and use return to break out. \n"} {"input": "How to install python3 version of package via pip on Ubuntu? I have both python2.7 and python3.2 installed in Ubuntu 12.04.\nThe symbolic link python links to python2.7.\nWhen I type:\nsudo pip install package-name\n\nit will by default install the python2 version of package-name.\nSome packages support both python2 and python3.\nHow do I install the python3 version of package-name via pip?\n", "output": "Ubuntu 12.10+ and Fedora 13+ have a package called python3-pip which will install pip-3.2 (or pip-3.3, pip-3.4 or pip3 for newer versions) without needing this jumping through hoops.\n\nI came across this and fixed this without needing the likes of wget or virtualenvs (assuming Ubuntu 12.04):\n\nInstall package python3-setuptools: run sudo aptitude install python3-setuptools, this will give you the command easy_install3.\nInstall pip using Python 3's setuptools: run sudo easy_install3 pip, this will give you the command pip-3.2 like kev's solution.\nInstall your PyPI packages: run sudo pip-3.2 install (installing python packages into your base system requires root, of course).\n\u2026\nProfit!\n\n"} {"input": "In Matplotlib, what does the argument mean in fig.add_subplot(111)? Sometimes I come across code such as this:\nimport matplotlib.pyplot as plt\nx = [1, 2, 3, 4, 5]\ny = [1, 4, 9, 16, 25]\nfig = plt.figure()\nfig.add_subplot(111)\nplt.scatter(x, y)\nplt.show()\n\nWhich produces:\n\nI've been reading the documentation like crazy but I can't find an explanation for the 111. 
sometimes I see a 212.\nWhat does the argument of fig.add_subplot() mean?\n", "output": "I think this would be best explained by the following picture:\n\nTo initialize the above, one would type:\nimport matplotlib.pyplot as plt\nfig = plt.figure()\nfig.add_subplot(221) #top left\nfig.add_subplot(222) #top right\nfig.add_subplot(223) #bottom left\nfig.add_subplot(224) #bottom right \nplt.show()\n\nEDIT: Some additional information\nThe following combinations produce asymmetrical arrangements of subplots.\nsubplot(2,2,[1 3])\nsubplot(2,2,2)\nsubplot(2,2,4)\n\n\nYou can also use the colon operator to specify multiple locations if they are in sequence.\nsubplot(2,2,1:2)\nsubplot(2,2,3)\nsubplot(2,2,4)\n\n\nReference here\n"} {"input": "python open built-in function: difference between modes a, a+, w, w+, and r+? In the python built-in open function, what is the exact difference between the modes w, a, w+, a+, and r+?\nIn particular, the documentation implies that all of these will allow writing to the file, and says that it opens the files for \"appending\", \"writing\", and \"updating\" specifically, but does not define what these terms mean.\n", "output": "The opening modes are exactly the same that C fopen() std library function.\nThe BSD fopen manpage defines them as follows:\n The argument mode points to a string beginning with one of the following\n sequences (Additional characters may follow these sequences.):\n\n ``r'' Open text file for reading. The stream is positioned at the\n beginning of the file.\n\n ``r+'' Open for reading and writing. The stream is positioned at the\n beginning of the file.\n\n ``w'' Truncate file to zero length or create text file for writing.\n The stream is positioned at the beginning of the file.\n\n ``w+'' Open for reading and writing. The file is created if it does not\n exist, otherwise it is truncated. The stream is positioned at\n the beginning of the file.\n\n ``a'' Open for writing. The file is created if it does not exist. 
The\n stream is positioned at the end of the file. Subsequent writes\n to the file will always end up at the then current end of file,\n irrespective of any intervening fseek(3) or similar.\n\n ``a+'' Open for reading and writing. The file is created if it does not\n exist. The stream is positioned at the end of the file. Subse-\n quent writes to the file will always end up at the then current\n end of file, irrespective of any intervening fseek(3) or similar.\n\n"} {"input": "How do I determine if my python shell is executing in 32bit or 64bit mode on OS X? I need a way to tell what mode the shell is in from within the shell. \nI've tried looking at the platform module but it seems only to tell you about \"about the bit architecture and the linkage format used for the executable\": the binary is compiled as 64bit though (I'm running on OS X 10.6) so it seems to always report 64bit even though I'm using the methods described here to force 32bit mode).\n", "output": "UPDATED:\nOne way is to look at sys.maxsize as documented here:\n$ python-32 -c 'import sys;print(\"%x\" % sys.maxsize, sys.maxsize > 2**32)'\n('7fffffff', False)\n$ python-64 -c 'import sys;print(\"%x\" % sys.maxsize, sys.maxsize > 2**32)'\n('7fffffffffffffff', True)\n\nsys.maxsize was introduced in Python 2.6. If you need a test for older systems, this slightly more complicated test should work on all Python 2 and 3 releases:\n$ python-32 -c 'import struct;print( 8 * struct.calcsize(\"P\"))'\n32\n$ python-64 -c 'import struct;print( 8 * struct.calcsize(\"P\"))'\n64\n\nBTW, you might be tempted to use platform.architecture() for this. 
Unfortunately, its results are not always reliable, particularly in the case of OS X universal binaries.\n$ arch -x86_64 /usr/bin/python2.6 -c 'import sys,platform; print platform.architecture()[0], sys.maxsize > 2**32'\n64bit True\n$ arch -i386 /usr/bin/python2.6 -c 'import sys,platform; print platform.architecture()[0], sys.maxsize > 2**32'\n64bit False\n\n"} {"input": "Setting the correct encoding when piping stdout in Python When piping the output of a Python program, the Python interpreter gets confused about encoding and sets it to None. This means a program like this:\n# -*- coding: utf-8 -*-\nprint u\"\u00e5\u00e4\u00f6\"\n\nwill work fine when run normally, but fail with:\n\nUnicodeEncodeError: 'ascii' codec can't encode character u'\\xa0' in position 0: ordinal not in range(128)\n\nwhen used in a pipe sequence.\nWhat is the best way to make this work when piping? Can I just tell it to use whatever encoding the shell/filesystem/whatever is using? \nThe suggestions I have seen thus far are to modify your site.py directly, or to hardcode the defaultencoding using this hack:\n# -*- coding: utf-8 -*-\nimport sys\nreload(sys)\nsys.setdefaultencoding('utf-8')\nprint u\"\u00e5\u00e4\u00f6\"\n\nIs there a better way to make piping work?\n", "output": "First, regarding this solution:\n# -*- coding: utf-8 -*-\nprint u\"\u00e5\u00e4\u00f6\".encode('utf-8')\n\nIt's not practical to explicitly print with a given encoding every time. That would be repetitive and error-prone.\nA better solution is to change sys.stdout at the start of your program, to encode with a selected encoding. Here is one solution I found on Python: How is sys.stdout.encoding chosen?, in particular a comment by \"toka\":\nimport sys\nimport codecs\nsys.stdout = codecs.getwriter('utf8')(sys.stdout)\n\n"} {"input": "Difference between Python's Generators and Iterators What is the difference between iterators and generators? 
Some examples for when you would use each case would be helpful.\n", "output": "iterator is a more general concept: any object whose class has a next method (__next__ in Python 3) and an __iter__ method that does return self.\nEvery generator is an iterator, but not vice versa. A generator is built by calling a function that has one or more yield expressions (yield statements, in Python 2.5 and earlier), and is an object that meets the previous paragraph's definition of an iterator.\nYou may want to use a custom iterator, rather than a generator, when you need a class with somewhat complex state-maintaining behavior, or want to expose other methods besides next (and __iter__ and __init__). Most often, a generator (sometimes, for sufficiently simple needs, a generator expression) is sufficient, and it's simpler to code because state maintenance (within reasonable limits) is basically \"done for you\" by the frame getting suspended and resumed.\nFor example, a generator such as:\ndef squares(start, stop):\n for i in xrange(start, stop):\n yield i * i\n\ngenerator = squares(a, b)\n\nor the equivalent generator expression (genexp)\ngenerator = (i*i for i in xrange(a, b))\n\nwould take more code to build as a custom iterator:\nclass Squares(object):\n def __init__(self, start, stop):\n self.start = start\n self.stop = stop\n def __iter__(self): return self\n def next(self):\n if self.start >= self.stop:\n raise StopIteration\n current = self.start * self.start\n self.start += 1\n return current\n\niterator = Squares(a, b)\n\nBut, of course, with class Squares you could easily offer extra methods, i.e.\n def current(self):\n return self.start\n\nif you have any actual need for such extra functionality in your application.\n"} {"input": "Python: user input and commandline arguments How do I have a Python script that can accept user input (assuming this is possible) and how do I make it read in arguments if run from the command line?\n", "output": "var = raw_input(\"Please 
enter something: \")\nprint \"you entered\", var\n\n"} {"input": "Elegant ways to support equivalence (\"equality\") in Python classes When writing custom classes it is often important to allow equivalence by means of the == and != operators. In Python, this is made possible by implementing the __eq__ and __ne__ special methods, respectively. The easiest way I've found to do this is the following method:\nclass Foo:\n def __init__(self, item):\n self.item = item\n\n def __eq__(self, other):\n if isinstance(other, self.__class__):\n return self.__dict__ == other.__dict__\n else:\n return False\n\n def __ne__(self, other):\n return not self.__eq__(other)\n\nDo you know of more elegant means of doing this? Do you know of any particular disadvantages to using the above method of comparing __dict__s?\nNote: A bit of clarification--when __eq__ and __ne__ are undefined, you'll find this behavior:\n>>> a = Foo(1)\n>>> b = Foo(1)\n>>> a is b\nFalse\n>>> a == b\nFalse\n\nThat is, a == b evaluates to False because it really runs a is b, a test of identity (i.e., \"Is a the same object as b?\").\nWhen __eq__ and __ne__ are defined, you'll find this behavior (which is the one we're after):\n>>> a = Foo(1)\n>>> b = Foo(1)\n>>> a is b\nFalse\n>>> a == b\nTrue\n\n", "output": "You need to be careful with inheritance:\n>>> class Foo:\n def __eq__(self, other):\n if isinstance(other, self.__class__):\n return self.__dict__ == other.__dict__\n else:\n return False\n\n>>> class Bar(Foo):pass\n\n>>> b = Bar()\n>>> f = Foo()\n>>> f == b\nTrue\n>>> b == f\nFalse\n\nCheck types more strictly, like this:\ndef __eq__(self, other):\n if type(other) is type(self):\n return self.__dict__ == other.__dict__\n return False\n\nBesides that, your approach will work fine, that's what special methods are there for.\n"} {"input": "How to use 'super' in Python? 
Can someone explain to me the difference between doing:\nclass Child(SomeBaseClass):\n def __init__(self):\n super(Child, self).__init__()\n\nand this:\nclass Child(SomeBaseClass):\n def __init__(self):\n SomeBaseClass.__init__(self)\n\nI've seen super being used quite a lot in classes with only single inheritance. I can see why you'd use it in multiple inheritance but am unclear as to what the advantages are of using it in this kind of situation.\n", "output": "The benefits of super() in single-inheritance are minimal -- mostly, you don't have to hard-code the name of the base class into every method that uses its parent methods.\nHowever, it's almost impossible to use multiple-inheritance without super(). This includes common idioms like mixins, interfaces, abstract classes, etc. This extends to code that later extends yours. If somebody later wanted to write a class that extended Child and a mixin, their code would not work properly.\n"} {"input": "How do I use raw_input in Python 3 import sys\nprint (sys.platform)\nprint (2 ** 100)\nraw_input( )\n\nI am using Python 3.1 and can't get the raw_input to \"freeze\" the dos pop-up. The book I'm reading is for 2.5 and I'm using 3.1\nWhat should I do to fix this?\n", "output": "raw_input() was renamed to input()\nFrom http://docs.python.org/dev/py3k/whatsnew/3.0.html\n"} {"input": "How do you convert a Python time.struct_time object into a datetime object? 
How do you convert a Python time.struct_time object into a datetime.datetime object?\nI have a library that provides the first one and a second library that wants the second one...\n", "output": "Use time.mktime() to convert the time tuple (in localtime) into seconds since the Epoch, then use datetime.fromtimestamp() to get the datetime object.\nfrom time import mktime\nfrom datetime import datetime\n\ndt = datetime.fromtimestamp(mktime(struct))\n\n"} {"input": "Getting \"Error loading MySQLdb module: No module named MySQLdb\" - have tried previously posted solutions This is a much discussed issue for OSX 10.6 users, but I haven't been able to find a solution that works. Here's my setup:\nPython 2.6.1 64bit\nDjango 1.2.1\nMySQL 5.1.47 osx10.6 64bit\nI create a virtualenvwrapper with --no-site-packages, then installed Django. When I activate the virtualenv and run python manage.py syncdb, I get this error:\nTraceback (most recent call last):\nFile \"manage.py\", line 11, in \n execute_manager(settings)\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/core/management/__init__.py\", line 438, in execute_manager\n utility.execute()\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/core/management/__init__.py\", line 379, in execute\n self.fetch_command(subcommand).run_from_argv(self.argv)\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/core/management/__init__.py\", line 257, in fetch_command\n klass = load_command_class(app_name, subcommand)\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/core/management/__init__.py\", line 67, in load_command_class\n module = import_module('%s.management.commands.%s' % (app_name, name))\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/utils/importlib.py\", line 35, in import_module\n __import__(name)\nFile 
\"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/core/management/commands/syncdb.py\", line 7, in \nfrom django.core.management.sql import custom_sql_for_model, emit_post_sync_signal\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/core/management/sql.py\", line 5, in \nfrom django.contrib.contenttypes import generic\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/contrib/contenttypes/generic.py\", line 6, in \n from django.db import connection\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/db/__init__.py\", line 75, in \n connection = connections[DEFAULT_DB_ALIAS]\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/db/utils.py\", line 91, in __getitem__\n backend = load_backend(db['ENGINE'])\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/db/utils.py\", line 32, in load_backend\n return import_module('.base', backend_name)\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/utils/importlib.py\", line 35, in import_module\n __import__(name)\nFile \"/Users/joerobinson/.virtualenvs/dj_tut/lib/python2.6/site-packages/django/db/backends/mysql/base.py\", line 14, in \n raise ImproperlyConfigured(\"Error loading MySQLdb module: %s\" % e)\ndjango.core.exceptions.ImproperlyConfigured: Error loading MySQLdb module: No module named MySQLdb\n\nI've also installed the MySQL for Python adapter, but to no avail (maybe I installed it improperly?). \nAnyone dealt with this before?\n", "output": "I had the same error and pip install MySQL-python solved it for me.\nAlternate installs:\n\nIf you don't have pip, easy_install MySQL-python should work.\nIf your python is managed by a packaging system, you might have to use\nthat system (e.g. 
sudo apt-get install ...)\n\nBelow, Soli notes that if you receive the following error:\n\nEnvironmentError: mysql_config not found\n\n... then you have a further system dependency issue. Solving this will vary from system to system, but for Debian-derived systems:\nsudo apt-get install python-mysqldb\n"} {"input": "What is the python \"with\" statement designed for? I came across the Python with statement for the first time today. I've been using Python lightly for several months and didn't even know of its existence! Given its somewhat obscure status, I thought it would be worth asking:\n\nWhat is the Python with statement\ndesigned to be used for? \nWhat do\nyou use it for? \nAre there any\ngotchas I need to be aware of, or\ncommon anti-patterns associated with\nits use? Any cases where it is better to use try..finally than with?\nWhy isn't it used more widely?\nWhich standard library classes are compatible with it?\n\n", "output": "\nI believe this has already been answered by other users before me, so I only add it for the sake of completeness: the with statement simplifies exception handling by encapsulating common preparation and cleanup tasks in so-called context managers. More details can be found in PEP 343. For instance, the open statement is a context manager in itself, which lets you open a file, keep it open as long as the execution is in the context of the with statement where you used it, and close it as soon as you leave the context, no matter whether you have left it because of an exception or during regular control flow. The with statement can thus be used in ways similar to the RAII pattern in C++: some resource is acquired by the with statement and released when you leave the with context.\nSome examples are: opening files using with open(filename) as fp:, acquiring locks using with lock: (where lock is an instance of threading.Lock). You can also construct your own context managers using the contextmanager decorator from contextlib. 
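The general shape of such a contextmanager-based context manager is a generator that runs its setup code, yields exactly once, and performs its cleanup in a finally block. A minimal sketch (the names managed_resource and events are invented here purely for illustration):

```python
from contextlib import contextmanager

events = []  # records the order in which things happen


@contextmanager
def managed_resource():
    events.append("setup")        # runs when the with block is entered
    try:
        yield "the resource"      # value bound by "with ... as ..."
    finally:
        events.append("cleanup")  # always runs, even if the body raises


with managed_resource() as r:
    events.append("used " + r)
```

After the with block, events holds ["setup", "used the resource", "cleanup"]; the finally clause is what guarantees the cleanup step whether the block exits normally or through an exception.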
For instance, I often use this when I have to change the current directory temporarily and then return to where I was:\nfrom contextlib import contextmanager\nimport os\n\n\n@contextmanager\ndef working_directory(path):\n current_dir = os.getcwd()\n os.chdir(path)\n try:\n yield\n finally:\n os.chdir(current_dir)\n\n\nwith working_directory(\"data/stuff\"):\n # do something within data/stuff\n# here I am back again in the original working directory\n\nHere's another example that temporarily redirects sys.stdin, sys.stdout and sys.stderr to some other file handle and restores them later:\nfrom contextlib import contextmanager\nimport sys\n\n\n@contextmanager\ndef redirected(**kwds):\n stream_names = [\"stdin\", \"stdout\", \"stderr\"]\n old_streams = {}\n try:\n for sname in stream_names:\n stream = kwds.get(sname, None)\n if stream is not None and stream != getattr(sys, sname):\n old_streams[sname] = getattr(sys, sname)\n setattr(sys, sname, stream)\n yield\n finally:\n for sname, stream in old_streams.iteritems():\n setattr(sys, sname, stream)\n\n\nwith redirected(stdout=open(\"/tmp/log.txt\", \"w\")):\n # these print statements will go to /tmp/log.txt\n print \"Test entry 1\"\n print \"Test entry 2\"\n# back to the normal stdout\nprint \"Back to normal stdout again\"\n\nAnd finally, another example that creates a temporary folder and cleans it up when leaving the context:\nfrom tempfile import mkdtemp\nfrom shutil import rmtree\n\n\n@contextmanager\ndef temporary_dir(*args, **kwds):\n name = mkdtemp(*args, **kwds)\n try:\n yield name\n finally:\n rmtree(name)\n\n\nwith temporary_dir() as dirname:\n # do whatever you want\n\n\n"} {"input": "Python: How to import other Python files How do I import other files in Python?\n\nHow exactly can I import a specific python file like import file.py?\nHow to import a folder instead of a specific file?\nI want to load a Python file dynamically in runtime, based on user\ninput.\nI want to know how to take from the file 
just one specific part.\n\nFor example, in main.py I have:\nfrom extra import * \n\nalthough this gives me all the file actions of extra.py when maybe all I want is \na single definition:\ndef gap():\n print\n print\n\nWhat do I add to the import statement to just get gap from extra.py?\n", "output": "There are many ways to import a python file, all with their pros and cons.\nDon't just hastily pick the first import strategy that works for you or else you'll have to rewrite the codebase later on when you find it doesn't meet your needs. \nI'll start out explaining the easiest example #1, then I'll move toward the most professional and robust example #5\nExample 1, Import a python module with python interpreter:\n\nPut this in /home/el/foo/fox.py:\ndef what_does_the_fox_say():\n print(\"vixens cry\")\n\nGet into the python interpreter: \nel@apollo:/home/el/foo$ python\nPython 2.7.3 (default, Sep 26 2013, 20:03:06) \n>>> import fox\n>>> fox.what_does_the_fox_say()\nvixens cry\n>>> \n\nYou imported fox through the python interpreter, invoked the python function what_does_the_fox_say() from within fox.py. \n\nExample 2, Use execfile or (exec in Python 3) in a script to execute the other python file in place:\n\nPut this in /home/el/foo2/mylib.py:\ndef moobar():\n print(\"hi\")\n\nPut this in /home/el/foo2/main.py:\nexecfile(\"/home/el/foo2/mylib.py\")\nmoobar()\n\nrun the file:\nel@apollo:/home/el/foo$ python main.py\nhi\n\nThe function moobar was imported from mylib.py and made available in main.py\n\nExample 3, Use from ... import ... 
functionality:\n\nPut this in /home/el/foo3/chekov.py:\ndef question():\n print \"where are the nuclear wessels?\"\n\nPut this in /home/el/foo3/main.py:\nfrom chekov import question\nquestion()\n\nRun it like this:\nel@apollo:/home/el/foo3$ python main.py \nwhere are the nuclear wessels?\n\nIf you defined other functions in chekov.py, they would not be available unless you import *\n\nExample 4, Import riaa.py if it's in a different file location from where it is imported\n\nPut this in /home/el/foo4/stuff/riaa.py:\ndef watchout():\n print \"my message\"\n\nPut this in /home/el/foo4/main.py:\nimport sys \nimport os\nsys.path.append(os.path.abspath(\"/home/el/foo4/stuff\"))\nfrom riaa import *\n\nwatchout()\n\nRun it:\nel@apollo:/home/el/foo4$ python main.py \nmy message\n\nThat imports everything in the foreign file from a different directory.\n\nExample 5, Import files in python with the bare import command:\n\nMake a new directory /home/el/foo5/\nMake a new directory /home/el/foo5/herp\nMake an empty file named __init__.py under herp:\nel@apollo:/home/el/foo5/herp$ touch __init__.py\nel@apollo:/home/el/foo5/herp$ ls\n__init__.py\n\nMake a new directory /home/el/foo5/herp/derp\nUnder derp, make another __init__.py file:\nel@apollo:/home/el/foo5/herp/derp$ touch __init__.py\nel@apollo:/home/el/foo5/herp/derp$ ls\n__init__.py\n\nUnder /home/el/foo5/herp/derp make a new file called yolo.py. Put this in there:\ndef skycake():\n print \"SkyCake evolves to stay just beyond the cognitive reach of the bulk of men. SKYCAKE!!\"\n\nThe moment of truth: make the new file /home/el/foo5/main.py and put this in there:\nfrom herp.derp.yolo import skycake\nskycake()\n\nRun it:\nel@apollo:/home/el/foo5$ python main.py\nSkyCake evolves to stay just beyond the cognitive reach of the bulk \nof men. 
SKYCAKE!!\n\nThe empty __init__.py file communicates to the python interpreter that the developer intends this directory to be an importable package.\n\nIf you want to see my post on how to include ALL .py files under a directory see here: http://stackoverflow.com/a/20753073/445131\nExample 6, use os.system(\"python yourfile.py\")\nimport os\nos.system(\"python yourfile.py\")\n\nBonus protip\nwhether you are using Mac, Linux or Windows, you need to be using python's idle editor as described here. It will unlock your python world. http://www.youtube.com/watch?v=DkW5CSZ_VII\n"} {"input": "Does Python's time.time() return the local or UTC timestamp? Does time.time() in the Python time module return the system's time or the time in UTC?\n", "output": "The time.time() function returns the number of seconds since the epoch as seconds in UTC.\nHere is some sample output I ran on my computer, converting it to a string as well.\nPython 2.7.3 (default, Apr 24 2012, 00:00:54) \n[GCC 4.7.0 20120414 (prerelease)] on linux2\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n>>> import time\n>>> ts = time.time()\n>>> print ts\n1355563265.81\n>>> import datetime\n>>> st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')\n>>> print st\n2012-12-15 01:21:05\n>>>\n\nThe ts variable is the time returned in seconds. I then converted it to a string using the datetime library making it a string that is human readable. \n"} {"input": "Split string into a list in Python I want my python function to split a sentence (input) and store each word in a list. The code that I've written so far splits the sentence, but does not store the words as a list. 
How do I do that?\ndef split_line(text):\n\n # split the text\n words = text.split()\n\n # for each word in the line:\n for word in words:\n\n # print the word\n print(word)\n\n", "output": "With no arguments, split the string in text on runs of whitespace (the default):\nwords = text.split() \n\nSplit the string in text on delimiter: \",\".\nwords = text.split(\",\") \n\nThe words variable will be a list datatype and contain a list of words from text split on the delimiter.\n"} {"input": "What's the proper way to install pip, virtualenv, and distribute for Python? Short Question\n\nWhat is the proper way to install pip, virtualenv, and distribute?\n\nBackground\nIn my answer to SO question 4314376, I recommended using ez_setup so that you could then install pip and virtualenv as follows:\ncurl -O http://peak.telecommunity.com/dist/ez_setup.py\nsudo python ez_setup.py\nsudo easy_install pip\nsudo pip install virtualenv\n\nI originally pulled these instructions from Jesse Noller's blog post So you want to use Python on the Mac?. I like the idea of keeping a clean global site-packages directory, so the only other packages I install there are virtualenvwrapper and distribute. (I recently added distribute to my toolbox because of this Python public service announcement.) To install these two packages, I used:\nsudo pip install virtualenvwrapper\ncurl -O http://python-distribute.org/distribute_setup.py\nsudo python distribute_setup.py\n\nNo more setuptools and easy_install\nTo really follow that Python public service announcement, on a fresh Python install, I would do the following:\ncurl -O http://python-distribute.org/distribute_setup.py\nsudo python distribute_setup.py\nsudo easy_install pip\nsudo pip install virtualenv\nsudo pip install virtualenvwrapper\n\nGlyph's Rebuke\nIn a comment to my answer to SO question 4314376, SO user Glyph stated:\n\nNO. NEVER EVER do sudo python setup.py install whatever. Write a ~/.pydistutils.cfg that puts your pip installation into ~/.local or something. 
Especially files named ez_setup.py tend to suck down newer versions of things like setuptools and easy_install, which can potentially break other things on your operating system.\n\nBack to the short question\nSo Glyph's response leads me to my original question:\n\nWhat is the proper way to install pip, virtualenv, and distribute?\n\n", "output": "You can do this without installing anything into python itself.\nYou don't need sudo or any privileges.\nYou don't need to edit any files.\nInstall virtualenv into a bootstrap virtual environment. Use that virtual environment to create more. Since virtualenv ships with pip and distribute, you get everything from one install.\n\nDownload virtualenv:\n\nhttp://pypi.python.org/pypi/virtualenv\nhttps://pypi.python.org/packages/source/v/virtualenv/virtualenv-12.0.7.tar.gz\n(or whatever is the latest version!)\n\nUnpack the source tarball\nUse the unpacked tarball to create a clean virtual environment. This virtual environment will be used to \"bootstrap\" others. 
All of your virtual environments will automatically contain pip and distribute.\nUsing pip, install virtualenv into that bootstrap environment.\nUse that bootstrap environment to create more!\n\nHere is an example in bash:\n# Select current version of virtualenv:\nVERSION=12.0.7\n# Name your first \"bootstrap\" environment:\nINITIAL_ENV=bootstrap\n# Set to whatever python interpreter you want for your first environment:\nPYTHON=$(which python)\nURL_BASE=https://pypi.python.org/packages/source/v/virtualenv\n\n# --- Real work starts here ---\ncurl -O $URL_BASE/virtualenv-$VERSION.tar.gz\ntar xzf virtualenv-$VERSION.tar.gz\n# Create the first \"bootstrap\" environment.\n$PYTHON virtualenv-$VERSION/virtualenv.py $INITIAL_ENV\n# Don't need this anymore.\nrm -rf virtualenv-$VERSION\n# Install virtualenv into the environment.\n$INITIAL_ENV/bin/pip install virtualenv-$VERSION.tar.gz\n\nNow you can use your \"bootstrap\" environment to create more:\n# Create a second environment from the first:\n$INITIAL_ENV/bin/virtualenv py-env1\n# Create more:\n$INITIAL_ENV/bin/virtualenv py-env2\n\nGo nuts!\nNote\nThis assumes you are not using a really old version of virtualenv.\nOld versions required the flags --no-site-packages (and depending on the version of Python, --distribute). Now you can create your bootstrap environment with just python virtualenv.py path-to-bootstrap or python3 virtualenv.py path-to-bootstrap.\n"} {"input": "How to make a Python script standalone executable to run without ANY dependency? I'm building a Python application and don't want to force my clients to install Python and modules. 
I also want to make my application closed-source.\nSo, is there a way to compile Python scripts to standalone executables?\n", "output": "You can use py2exe as already answered, and use Cython to compile your key .py files into C extension modules (like .dll/.pyd on Windows and .so on Linux), which are much harder to reverse than plain .pyo and .pyc files (and you also gain performance!)\n"} {"input": "Why isn't Python very good for functional programming? I have always thought that functional programming can be done in Python. Thus, I was surprised that Python didn't get much of a mention in this question, and when it was mentioned, it normally wasn't very positive. However, not many reasons were given for this (lack of pattern matching and algebraic data types were mentioned). So my question is: why isn't Python very good for functional programming? Are there more reasons than its lack of pattern matching and algebraic data types? Or are these concepts so important to functional programming that a language that doesn't support them can only be classed as a second rate functional programming language? (Keep in mind that my experience with functional programming is quite limited.)\n", "output": "The question you reference asks which languages promote both OO and functional programming. Python does not promote functional programming even though it works fairly well.\nThe best argument against functional programming in Python is that imperative/OO use cases are carefully considered by Guido, while functional programming use cases are not. When I write imperative Python, it's one of the prettiest languages I know. 
When I write functional Python, it becomes as ugly and unpleasant as your average language that doesn't have a BDFL.\nWhich is not to say that it's bad, just that you have to work harder than you would if you switched to a language that promotes functional programming or switched to writing OO Python.\nHere are the functional things I miss in Python:\n\nPattern matching\nTail recursion\nLarge library of list functions\nFunctional dictionary class\nAutomatic currying\nConcise way to compose functions\nLazy lists\nSimple, powerful expression syntax (Python's simple block syntax prevents Guido from adding it)\n\n\n\nNo pattern matching and no tail recursion mean your basic algorithms have to be written imperatively. Recursion is ugly and slow in Python.\nA small list library and no functional dictionaries mean that you have to write a lot of stuff yourself. \nNo syntax for currying or composition means that point-free style is about as full of punctuation as explicitly passing arguments.\nIterators instead of lazy lists means that you have to know whether you want efficiency or persistence, and to scatter calls to list around if you want persistence. (Iterators are use-once)\nPython's simple imperative syntax, along with its simple LL1 parser, mean that a better syntax for if-expressions and lambda-expressions is basically impossible. Guido likes it this way, and I think he's right.\n\n"} {"input": "How to empty a list in Python? 
It seems so \"dirty\" emptying a list in this way:\nwhile len(alist) > 0 : alist.pop()\n\nDoes a clear way exist to do that?\n", "output": "This actually removes the contents from the list, not replaces the old label with a new empty list\ndel lst[:]\n\nexample:\nlst1 = [1, 2, 3]\nlst2 = lst1\ndel lst1[:]\nprint(lst2)\n\nFor the sake of completeness, slice assignment achieves the same effect:\nlst[:] = []\n\nand can be used to shrink a part of the list while replacing a part at the same time (but is out of scope of the question).\nNote that doing lst = [] does not empty the list, just creates a new object and binds it to the variable lst, but the old list will still have the same elements, and effect will be apparent if it had other variable bindings.\n"} {"input": "Remove specific characters from a string in python I'm trying to remove specific characters from a string using python. This is the code I'm using right now. Unfortunately it appears to do nothing to the string.\nfor char in line:\n if char in \" ?.!/;:\":\n line.replace(char,'')\n\nHow do I do this properly?\n", "output": "Strings in Python are immutable (can't be changed). Because of this, the effect of line.replace(...) is just to create a new string, rather than changing the old one. You need to rebind (assign) it to line in order to have that variable take the new value, with those characters removed.\nAlso, the way you are doing it is going to be kind of slow, relatively. It's also likely to be a bit confusing to experienced pythonators, who will see a doubly-nested structure and think for a moment that something more complicated is going on.\nStarting in Python 2.6 and newer Python 2.x versions *, you can instead use str.translate, (but read on for Python 3 differences):\nline = line.translate(None, '!@#$')\n\nor regular expression replacement with re.sub\nimport re\nline = re.sub('[!@#$]', '', line)\n\nThe characters enclosed in brackets constitute a character class. 
Any characters in line which are in that class are replaced with the second parameter to sub: an empty string.\nIn Python 3, strings are Unicode. You'll have to translate a little differently. kevpie mentions this in a comment on one of the answers, and it's noted in the documentation for str.translate.\nWhen calling the translate method of a Unicode string, you cannot pass the second parameter that we used above. You also can't pass None as the first parameter, or even a translation table from string.maketrans. Instead, you pass a dictionary as the only parameter. This dictionary maps the ordinal values of characters (i.e. the result of calling ord on them) to the ordinal values of the characters which should replace them, or, usefully to us, None to indicate that they should be deleted.\nSo to do the above dance with a Unicode string you would call something like\ntranslation_table = dict.fromkeys(map(ord, '!@#$'), None)\nunicode_line = unicode_line.translate(translation_table)\n\nHere dict.fromkeys and map are used to succinctly generate a dictionary containing\n{ord('!'): None, ord('@'): None, ...}\n\nEven simpler, as another answer puts it, create the dictionary in place:\nunicode_line = unicode_line.translate({ord(c): None for c in '!@#$'})\n\n\n* for compatibility with earlier Pythons, you can create a \"null\" translation table to pass in place of None:\nimport string\nline = line.translate(string.maketrans('', ''), '!@#$')\n\nHere string.maketrans is used to create a translation table, which is just a string containing the characters with ordinal values 0 to 255.\n"} {"input": "What is memoization and how can I use it in Python? I just started Python and I've got no idea what memoization is and how to use it. 
Also, may I have a simplified example?\n", "output": "Memoization effectively refers to remembering (\"memoization\" -> \"memorandum\" -> to be remembered) results of method calls based on the method inputs and then returning the remembered result rather than computing the result again. You can think of it as a cache for method results. For further details, see page 387 for the definition in Introduction To Algorithms (3e), Cormen et al.\nA simple example for computing factorials using memoization in Python would be something like this:\nfactorial_memo = {}\ndef factorial(k):\n if k < 2: return 1\n if k not in factorial_memo:\n factorial_memo[k] = k * factorial(k-1)\n return factorial_memo[k]\n\nYou can get more complicated and encapsulate the memoization process into a class\nclass Memoize:\n def __init__(self, f):\n self.f = f\n self.memo = {}\n def __call__(self, *args):\n if not args in self.memo:\n self.memo[args] = self.f(*args)\n return self.memo[args]\n\nThen:\ndef factorial(k):\n if k < 2: return 1\n return k * factorial(k - 1)\n\nfactorial = Memoize(factorial)\n\n"} {"input": "How to overcome \"datetime.datetime not JSON serializable\" in python? 
I have a basic dict as follows:\nsample = {}\nsample['title'] = \"String\"\nsample['somedate'] = somedatetimehere\n\nWhen I try to do jsonify(sample) I get:\n\nTypeError: datetime.datetime(2012, 8, 8, 21, 46, 24, 862000) is not JSON serializable\n\nWhat can I do such that my dictionary sample can overcome the error above?\nNote: Though it may not be relevant, the dictionaries are generated from the retrieval of records out of mongodb where when I print out str(sample['somedate']), the output is 2012-08-08 21:46:24.862000.\n", "output": "Building on other answers, a simple solution based on a specific serializer that just converts datetime.datetime objects to strings.\nfrom datetime import datetime\n\ndef json_serial(obj):\n \"\"\"JSON serializer for objects not serializable by default json code\"\"\"\n\n if isinstance(obj, datetime):\n serial = obj.isoformat()\n return serial\n raise TypeError (\"Type not serializable\")\n\nAs seen, the code just checks to find out if object is of class datetime.datetime, and then uses .isoformat() to produce a serialized version of it, according to ISO 8601 format, YYYY-MM-DDTHH:MM:SS (which is easily decoded by JavaScript). If more complex serialized representations are sought, other code could be used instead of str() (see other answers to this question for examples). The code ends by raising an exception, to deal with the case it is called with a non-serializable type which is not datetime.\nThis json_serial function can be used as follows:\nfrom datetime import datetime\nfrom json import dumps\n\nprint dumps(datetime.now(), default=json_serial)\n\nThe details about how the default parameter to json.dumps works can be found in Section Basic Usage of the json module documentation. 
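If you would rather not pass default= on every call, the same check can be hooked into a json.JSONEncoder subclass instead, via its default method. A sketch (the class name DateTimeEncoder is invented here):

```python
import json
from datetime import datetime


class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        # called only for objects the base encoder can't serialize
        if isinstance(obj, datetime):
            return obj.isoformat()
        # falls back to the base class, which raises TypeError
        return json.JSONEncoder.default(self, obj)


sample = {"title": "String", "somedate": datetime(2012, 8, 8, 21, 46, 24, 862000)}
encoded = json.dumps(sample, cls=DateTimeEncoder, sort_keys=True)
print(encoded)  # {"somedate": "2012-08-08T21:46:24.862000", "title": "String"}
```

Passing cls=DateTimeEncoder makes json.dumps use the subclass in place of the default encoder, so the datetime handling travels with every call site that uses it.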
\n"} {"input": "Python Print String To Text File I'm using Python to open a text document:\ntext_file = open(\"Output.txt\", \"w\")\n\ntext_file.write(\"Purchase Amount: \" 'TotalAmount')\n\ntext_file.close()\n\nI want to enter the string called \"TotalAmount\" into the text document. Can someone please let me know how to do this?\n", "output": "text_file = open(\"Output.txt\", \"w\")\ntext_file.write(\"Purchase Amount: %s\" % TotalAmount)\ntext_file.close()\n\nIf you use a context manager, the file is closed automatically for you\nwith open(\"Output.txt\", \"w\") as text_file:\n text_file.write(\"Purchase Amount: %s\" % TotalAmount)\n\nIf you're using Python2.6 or higher, it's preferred to use str.format()\nwith open(\"Output.txt\", \"w\") as text_file:\n text_file.write(\"Purchase Amount: {0}\".format(TotalAmount))\n\nFor python2.7 and higher you can use {} instead of {0}\nIn Python3, there is an optional file parameter to the print function\nwith open(\"Output.txt\", \"w\") as text_file:\n print(\"Purchase Amount: {}\".format(TotalAmount), file=text_file)\n\n"} {"input": "Writing a list to a file with Python Is this the cleanest way to write a list to a file, since writelines() doesn't insert newline characters?\nfile.writelines([\"%s\\n\" % item for item in list])\n\nIt seems like there would be a standard way...\n", "output": "\nEDIT Adding info from Thomas' comment\nDon't forget to open the file first\nthefile = open('test.txt', 'w')\n\nI'd use a loop:\nfor item in thelist:\n thefile.write(\"%s\\n\" % item)\n\nor:\nfor item in thelist:\n print>>thefile, item\n\nIf you're keen on a single function call, at least remove the square brackets [] so that the strings to be printed get made one at a time (a genexp rather than a listcomp) -- no reason to take up all the memory required to materialize the whole list of strings.\n"} {"input": "Python, extract file name from path, no matter what the os/path format Which Python library can I use to extract filenames from 
paths, no matter what the operating system or path format could be?\nFor example, I'd like all of these paths to return me c:\na/b/c/\na/b/c\n\\a\\b\\c\n\\a\\b\\c\\\na\\b\\c\na/b/../../a/b/c/\na/b/../../a/b/c\n\n", "output": "Actually, there's a function that returns exactly what you want\nprint os.path.basename(your_path)\n\n"} {"input": "What's the difference between a Python module and a Python package? What's the difference between a Python module and a Python package?\nSee also: What's the difference between \"package\" and \"module\" (for other languages)\n", "output": "Any Python file is a module, its name being the file's base name without the .py extension. A package is a collection of Python modules: while a module is a single Python file, a package is a directory of Python modules containing an additional __init__.py file, to distinguish a package from a directory that just happens to contain a bunch of Python scripts. Packages can be nested to any depth, provided that the corresponding directories contain their own __init__.py file.\nThe distinction between module and package seems to hold just at the file system level. When you import a module or a package, the corresponding object created by Python is always of type module. Note, however, when you import a package, only variables/functions/classes in the __init__.py file of that package are directly visible, not sub-packages or modules. As an example, consider the xml package in the Python standard library: its xml directory contains an __init__.py file and four sub-directories; the sub-directory etree contains an __init__.py file and, among others, an ElementTree.py file. 
See what happens when you try to interactively import package/modules:\n>>> import xml\n>>> type(xml)\n<type 'module'>\n>>> xml.etree.ElementTree\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nAttributeError: 'module' object has no attribute 'etree'\n>>> import xml.etree\n>>> type(xml.etree)\n<type 'module'>\n>>> xml.etree.ElementTree\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nAttributeError: 'module' object has no attribute 'ElementTree'\n>>> import xml.etree.ElementTree\n>>> type(xml.etree.ElementTree)\n<type 'module'>\n>>> xml.etree.ElementTree.parse\n<function parse at 0x...>\n\nIn Python there also are built-in modules, such as sys, that are written in C, but I don't think you meant to consider those in your question.\n"} {"input": "Cannot install Lxml on Mac os x 10.9 I want to install Lxml so I can then install Scrapy.\nWhen I updated my Mac today it wouldn't let me reinstall lxml, I get the following error:\nIn file included from src/lxml/lxml.etree.c:314:\n/private/tmp/pip_build_root/lxml/src/lxml/includes/etree_defs.h:9:10: fatal error: 'libxml/xmlversion.h' file not found\n#include \"libxml/xmlversion.h\"\n ^\n1 error generated.\nerror: command 'cc' failed with exit status 1\n\nI have tried using brew to install libxml2 and libxslt, both installed fine but I still cannot install lxml.\nLast time I was installing I needed to enable the developer tools on Xcode but since it's updated to Xcode 5 it doesn't give me that option anymore.\nDoes anyone know what I need to do?\n", "output": "You should install or upgrade the command line tools for Xcode.\nTry this in a terminal:\nxcode-select --install\n\n"} {"input": "Generator Expressions vs. List Comprehension When should you use generator expressions and when should you use list comprehensions in Python?\n# Generator expression\n(x*2 for x in range(256))\n\n# List comprehension\n[x*2 for x in range(256)]\n\n", "output": "John's answer is good (that list comprehensions are better when you want to iterate over something multiple times). 
However, it's also worth noting that you should use a list if you want to use any of the list methods. For example, the following code won't work:\ndef gen():\n return (something for something in get_some_stuff())\n\nprint gen()[:2] # generators don't support indexing or slicing\nprint [5,6] + gen() # generators can't be added to lists\n\nBasically, use a generator expression if all you're doing is iterating once. If you want to store and use the generated results, then you're probably better off with a list comprehension.\nSince performance is the most common reason to choose one over the other, my advice is to not worry about it and just pick one; if you find that your program is running too slowly, then and only then should you go back and worry about tuning your code.\n"} {"input": "How to calculate the angle between a line and the horizontal axis? In a programming language (Python, C#, etc) I need to determine how to calculate the angle between a line and the horizontal axis?\nI think an image describes best what I want:\n\nGiven (P1x,P1y) and (P2x,P2y) what is the best way to calculate this angle? 
The origin is in the topleft and only the positive quadrant is used.\n", "output": "First find the difference between the start point and the end point.\ndeltaY = P2_y - P1_y\ndeltaX = P2_x - P1_x\n\nThen calculate the angle.\nangleInDegrees = arctan(deltaY / deltaX) * 180 / PI\n\nIf your language includes an atan2 function it becomes the following instead:\nangleInDegrees = atan2(deltaY, deltaX) * 180 / PI\n\nAn implementation in Python using radians (provided by Eric Leschinski, who edited my answer):\nfrom math import *\ndef angle_trunc(a):\n while a < 0.0:\n a += pi * 2\n return a\n\ndef getAngleBetweenPoints(x_orig, y_orig, x_landmark, y_landmark):\n deltaY = y_landmark - y_orig\n deltaX = x_landmark - x_orig\n return angle_trunc(atan2(deltaY, deltaX))\n\nangle = getAngleBetweenPoints(5, 2, 1,4)\nassert angle >= 0, \"angle must be >= 0\"\nangle = getAngleBetweenPoints(1, 1, 2, 1)\nassert angle == 0, \"expecting angle to be 0\"\nangle = getAngleBetweenPoints(2, 1, 1, 1)\nassert abs(pi - angle) <= 0.01, \"expecting angle to be pi, it is: \" + str(angle)\nangle = getAngleBetweenPoints(2, 1, 2, 3)\nassert abs(angle - pi/2) <= 0.01, \"expecting angle to be pi/2, it is: \" + str(angle)\nangle = getAngleBetweenPoints(2, 1, 2, 0)\nassert abs(angle - (pi+pi/2)) <= 0.01, \"expecting angle to be pi+pi/2, it is: \" + str(angle)\nangle = getAngleBetweenPoints(1, 1, 2, 2)\nassert abs(angle - (pi/4)) <= 0.01, \"expecting angle to be pi/4, it is: \" + str(angle)\nangle = getAngleBetweenPoints(-1, -1, -2, -2)\nassert abs(angle - (pi+pi/4)) <= 0.01, \"expecting angle to be pi+pi/4, it is: \" + str(angle)\nangle = getAngleBetweenPoints(-1, -1, -1, 2)\nassert abs(angle - (pi/2)) <= 0.01, \"expecting angle to be pi/2, it is: \" + str(angle)\n\nAll tests pass. See https://en.wikipedia.org/wiki/Unit_circle\n"} {"input": "Convert list to tuple in Python I'm trying to convert a list to a tuple. 
\nWhen I google it, I find a lot of answers similar to:\nl = [4,5,6]\ntuple(l)\n\nBut if I do that I get this error message:\n\nTypeError: 'tuple' object is not callable\n\nHow can I fix this problem?\n", "output": "It should work fine. Don't use tuple, list or other special names as a variable name. It's probably what's causing your problem.\n>>> l = [4,5,6]\n>>> tuple(l)\n(4, 5, 6)\n\n"} {"input": "No module named MySQLdb I am using Python version 2.5.4 and install MySQL version 5.0 and Django. Django is working fine with Python, but not MySQL. Can anyone help me to solve this issue? I am using it in Vista.\n", "output": "You need to use one of the following commands. Which one depends on what OS and software you have and use.\n\neasy_install mysql-python (mix os)\npip install mysql-python (mix os)\napt-get install python-mysqldb (Linux Ubuntu, ...)\ncd /usr/ports/databases/py-MySQLdb && make install clean (FreeBSD)\nyum install MySQL-python (Linux Fedora, CentOS ...)\n\nFor Windows, see this answer: Install mysql-python (Windows)\n"} {"input": "Python \"extend\" for a dictionary Which is the best way to extend a dictionary with another one? For instance:\n>>> a = { \"a\" : 1, \"b\" : 2 }\n>>> b = { \"c\" : 3, \"d\" : 4 }\n>>> a\n{'a': 1, 'b': 2}\n>>> b\n{'c': 3, 'd': 4}\n\nI'm looking for any operation to obtain this avoiding for loop:\n{ \"a\" : 1, \"b\" : 2, \"c\" : 3, \"d\" : 4 }\n\nI wish to do something like:\na.extend(b) # This does not work\n\n", "output": "a.update(b)\n\nPython Standard Library Documentation\n"} {"input": "How to convert strings into integers in Python? 
I have a tuple of tuples from a MySQL query like this:\nT1 = (('13', '17', '18', '21', '32'),\n ('07', '11', '13', '14', '28'),\n ('01', '05', '06', '08', '15', '16'))\n\nI'd like to convert all the string elements into integers and put it back nicely to list of lists this time:\nT2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]\n\nI tried to achieve it with eval but didn't get any decent result yet.\n", "output": "int() is the Python standard built-in function to convert a string into an integer value. You call it with a string containing a number as the argument, and it returns the number converted to an actual integer:\nprint int(\"1\") + 1\n\nThe above prints 2.\nIf you know the structure of your list (that it simply contains lists, only one level), you could do this:\nT2 = [map(int, x) for x in T1]\n\n"} {"input": "How do I sort a list of strings in Python? What is the best way of creating an alphabetically sorted list in Python?\n", "output": "Basic answer:\nmylist = [\"b\", \"C\", \"A\"]\nmylist.sort()\n\nThis modifies your original list (i.e. sorts in-place). To get a sorted copy of the list, without changing the original, use the sorted() function:\nfor x in sorted(mylist):\n print x\n\nHowever, the examples above are a bit naive, because they don't take locale into account, and perform a case-sensitive sorting. 
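A quick illustration of that case-sensitive behaviour: in plain ASCII ordering every uppercase letter sorts before any lowercase one, and (on Python 3) str.casefold gives a case-insensitive key that is still not locale-aware:

```python
mylist = ["b", "C", "A"]
print(sorted(mylist))                    # ['A', 'C', 'b'] -- all uppercase first
print(sorted(mylist, key=str.casefold))  # ['A', 'b', 'C'] -- case ignored, locale still ignored
```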
You can take advantage of the optional parameter key to specify custom sorting order (the alternative, using cmp, is a deprecated solution, as it has to be evaluated multiple times - key is only computed once per element).\nSo, to sort according to the current locale, taking language-specific rules into account (cmp_to_key is a helper function from functools):\nsorted(mylist, key=cmp_to_key(locale.strcoll))\n\nAnd finally, if you need, you can specify a custom locale for sorting:\nimport locale\nlocale.setlocale(locale.LC_ALL, 'en_US.UTF-8') # vary depending on your lang/locale\nassert sorted((u'Ab', u'ad', u'aa'),\n key=cmp_to_key(locale.strcoll)) == [u'aa', u'Ab', u'ad']\n\nLast note: you will see examples of case-insensitive sorting which use the lower() method - those are incorrect, because they work only for the ASCII subset of characters. Those two are wrong for any non-English data:\n# this is incorrect!\nmylist.sort(key=lambda x: x.lower())\n# alternative notation, a bit faster, but still wrong\nmylist.sort(key=str.lower)\n\n"} {"input": "Chain-calling parent constructors in python Consider this - a base class A, class B inheriting from A, class C inheriting from B. What is a generic way to call a parent class constructor in a constructor? If this still sounds too vague, here's some code.\nclass A(object):\n def __init__(self):\n print \"Constructor A was called\"\n\nclass B(A):\n def __init__(self):\n super(B,self).__init__()\n print \"Constructor B was called\"\n\nclass C(B):\n def __init__(self):\n super(C,self).__init__()\n print \"Constructor C was called\"\n\nc = C()\n\nThis is how I do it now. But it still seems a bit too non-generic - you still must pass a correct type by hand.\nNow, I've tried using self.__class__ as a first argument to super(), but, obviously it doesn't work - if you put it in the constructor for C - fair enough, B's constructor gets called. 
If you do the same in B, \"self\" still points to an instance of C so you end up calling B's constructor again (this ends in an infinite recursion).\nThere is no need to think about diamond inheritance for now, I am just interested in solving this specific problem.\n", "output": "The way you are doing it is indeed the recommended one (for Python 2.x). \nThe issue of whether the class is passed explicitly to super is a matter of style rather than functionality. Passing the class to super fits in with Python's philosophy of \"explicit is better than implicit\".\n"} {"input": "Python: Find in list I have come across this delightful:\nitem = someSortOfSelection()\nif item in myList:\n doMySpecialFunction(item)\n\nbut sometimes it does not work with all my items, as if they weren't recognized in the list (when it's a list of string).\nis it the most 'pythonic' way of finding an item in a list: if x in l:?\n", "output": "As for your first question: that code is perfectly fine and should work if item equals one of the elements inside myList. Maybe you try to find a string that does not exactly match one of the items or maybe you are using a float value which suffers from inaccuracy.\nAs for your second question: There's actually several possible ways if \"finding\" things in lists.\nChecking if something is inside\nThis is the use case you describe: Checking whether something is inside a list or not. As you know, you can use the in operator for that:\n3 in [1, 2, 3] # => True\n\nFiltering a collection\nThat is, finding all elements in a sequence that meet a certain condition. You can use list comprehension or generator expressions for that:\nmatches = [x for x in lst if fulfills_some_condition(x)]\nmatches = (x for x in lst if x > 6)\n\nThe latter will return a generator which you can imagine as a sort of lazy list that will only be built as soon as you iterate through it. 
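To make the laziness visible, a small sketch reusing the lst example:

```python
lst = [1, 2, 3, 7, 8, 9]
matches = (x for x in lst if x > 6)  # nothing is computed yet

print(next(matches))  # 7 -- the first match, produced on demand
print(list(matches))  # [8, 9] -- consuming the rest exhausts the generator
print(list(matches))  # [] -- a generator can only be iterated once
```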
By the way, the first one is exactly equivalent to\nmatches = filter(fulfills_some_condition, lst)\n\nin Python 2. Here you can see higher-order functions at work. In Python 3, filter doesn't return a list, but a generator-like object.\nFinding the first occurrence\nIf you only want the first thing that matches a condition (but you don't know what it is yet), it's fine to use a for loop (possibly using the else clause as well, which is not really well-known). You can also use\nnext(x for x in lst if ...)\n\nwhich will return the first match or raise a StopIteration if none is found. Alternatively, you can use\nnext((x for x in lst if ...), [default value])\n\nFinding the location of an item\nFor lists, there's also the index method that can sometimes be useful if you want to know where a certain element is in the list:\n[1,2,3].index(2) # => 1\n[1,2,3].index(4) # => ValueError\n\nHowever, note that if you have duplicates, .index always returns the lowest index:\n[1,2,3,2].index(2) # => 1\n\nIf there are duplicates and you want all the indexes then you can use enumerate() instead:\n[i for i,x in enumerate([1,2,3,2]) if x==2] # => [1, 3]\n\n"} {"input": "Difference between __getattr__ vs __getattribute__ I am trying to understand when to use __getattr__ or __getattribute__. The documentation mentions __getattribute__ applies to new-style classes. What are new-style classes? \n", "output": "A key difference between __getattr__ and __getattribute__ is that __getattr__ is only invoked if the attribute wasn't found the usual ways. It's good for implementing a fallback for missing attributes, and is probably the one of two you want.\n__getattribute__ is invoked before looking at the actual attributes on the object, and so can be tricky to implement correctly. You can end up in infinite recursions very easily.\nNew-style classes derive from object, old-style classes are those in Python 2.x with no explicit base class. 
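A minimal sketch of that fallback behaviour (the class and attribute names are made up for illustration):

```python
class Record(object):
    def __init__(self):
        self.present = 42  # a real attribute

    def __getattr__(self, name):
        # Called only when normal attribute lookup has already failed.
        return "default for %s" % name

r = Record()
print(r.present)  # 42 -- found normally, __getattr__ is never invoked
print(r.missing)  # default for missing -- the fallback kicks in
```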
But the distinction between old-style and new-style classes is not the important one when choosing between __getattr__ and __getattribute__.\nYou almost certainly want __getattr__.\n"} {"input": "C-like structures in Python Is there a way to conveniently define a C-like structure in Python? I'm tired of writing stuff like:\nclass MyStruct():\n def __init__(self, field1, field2, field3):\n self.field1 = field1\n self.field2 = field2\n self.field3 = field3\n\n", "output": "Use a named tuple, which was added to the collections module in the standard library in Python 2.6. It's also possible to use Raymond Hettinger's named tuple recipe if you need to support Python 2.4.\nIt's nice for your basic example, but also covers a bunch of edge cases you might run into later as well. Your fragment above would be written as:\nfrom collections import namedtuple\nMyStruct = namedtuple(\"MyStruct\", \"field1 field2 field3\")\n\nThe newly created type can be used like this:\nm = MyStruct(\"foo\", \"bar\", \"baz\")\n\nOr you can use named arguments:\nm = MyStruct(field1 = \"foo\", field2 = \"bar\", field3 = \"baz\")\n\n"} {"input": "Separation of business logic and data access in django I am writing a project in Django and I see that 80% of the code is in the file models.py. This code is confusing and, after a certain time, I cease to understand what is really happening.\nHere is what bothers me:\n\nI find it ugly that my model level (which was supposed to be\nresponsible only for the work with data from a database) is also\nsending email, walking on api to other services, etc. \nAlso, I find it unacceptable to place business logic in the view, because\nthis way it becomes difficult to control. For example, in my\napplication there are at least three ways to create new\ninstances of User, but technically it should create them uniformly. 
\nI do not always notice when the methods and\nproperties of my models become non-deterministic and when they develop\nside effects.\nHere is a simple example. At first, the User model was like this:\nclass User(db.Models):\n\n def get_present_name(self):\n return self.name or 'Anonymous'\n\n def activate(self):\n self.status = 'activated'\n self.save()\n\nOver time, it turned into this:\nclass User(db.Models):\n\n def get_present_name(self): \n # property became non-deterministic in terms of database\n # data is taken from another service by api\n return remote_api.request_user_name(self.uid) or 'Anonymous' \n\n def activate(self):\n # method now has a side effect (send message to user)\n self.status = 'activated'\n self.save()\n send_mail('Your account is activated!', '…', [self.email])\n\nWhat I want is to separate entities in my code:\n\nEntities of my database, database level: What contains my application?\nEntities of my application, business logic level: What can make my application?\n\nWhat are the good practices to implement such an approach that can be applied in Django?\n", "output": "It seems like you are asking about the difference between the data model and the domain model – the latter is where you can find the business logic and entities as perceived by your end user, the former is where you actually store your data. \nFurthermore, I've interpreted the 3rd part of your question as: how to notice failure to keep these models separate. \nThese are two very different concepts and it's always hard to keep them separate. However, there are some common patterns and tools that can be used for this purpose. \nAbout the Domain Model\nThe first thing you need to recognize is that your domain model is not really about data; it is about actions and questions such as \"activate this user\", \"deactivate this user\", \"which users are currently activated?\", and \"what is this user's name?\". 
In classical terms: it's about queries and commands. \nThinking in Commands\nLet's start by looking at the commands in your example: \"activate this user\" and \"deactivate this user\". The nice thing about commands is that they can easily be expressed by small given-when-then scenarios: \n\ngiven an inactive user \nwhen the admin activates this user \nthen the user becomes active \nand a confirmation e-mail is sent to the user \nand an entry is added to the system log\n (etc. etc.)\n\nSuch scenarios are useful to see how different parts of your infrastructure can be affected by a single command – in this case your database (some kind of 'active' flag), your mail server, your system log, etc. \nSuch scenarios also really help you in setting up a Test Driven Development environment. \nAnd finally, thinking in commands really helps you create a task-oriented application. Your users will appreciate this :-)\nExpressing Commands\nDjango provides two easy ways of expressing commands; they are both valid options and it is not unusual to mix the two approaches. \nThe service layer\nThe service module has already been described by @Hedde. Here you define a separate module and each command is represented as a function. \nservices.py\ndef activate_user(user_id):\n user = User.objects.get(pk=user_id)\n\n # set active flag\n user.active = True\n user.save()\n\n # mail user\n send_mail(...)\n\n # etc etc\n\nUsing forms\nThe other way is to use a Django Form for each command. 
I prefer this approach, because it combines multiple closely related aspects:\n\nexecution of the command (what does it do?)\nvalidation of the command parameters (can it do this?)\npresentation of the command (how can I do this?)\n\nforms.py\nclass ActivateUserForm(forms.Form):\n\n user_id = IntegerField(widget = UsernameSelectWidget, verbose_name=\"Select a user to activate\")\n # the username select widget is not a standard Django widget, I just made it up\n\n def clean_user_id(self):\n user_id = self.cleaned_data['user_id']\n if User.objects.get(pk=user_id).active:\n raise ValidationError(\"This user cannot be activated\")\n # you can also check authorizations etc. \n return user_id\n\n def execute(self):\n \"\"\"\n This is not a standard method in the forms API; it is intended to replace the \n 'extract-data-from-form-in-view-and-do-stuff' pattern by a more testable pattern. \n \"\"\"\n user_id = self.cleaned_data['user_id']\n\n user = User.objects.get(pk=user_id)\n\n # set active flag\n user.active = True\n user.save()\n\n # mail user\n send_mail(...)\n\n # etc etc\n\nThinking in Queries\nYour example did not contain any queries, so I took the liberty of making up a few useful queries. I prefer to use the term \"question\", but queries is the classical terminology. Interesting queries are: \"What is the name of this user?\", \"Can this user log in?\", \"Show me a list of deactivated users\", and \"What is the geographical distribution of deactivated users?\" \nBefore embarking on answering these queries, you should always ask yourself two questions: is this a presentational query just for my templates, and/or a business logic query tied to executing my commands, and/or a reporting query. \nPresentational queries are merely made to improve the user interface. The answers to business logic queries directly affect the execution of your commands. Reporting queries are merely for analytical purposes and have looser time constraints. 
These categories are not mutually exclusive. \nThe other question is: \"do I have complete control over the answers?\" For example, when querying the user's name (in this context) we do not have any control over the outcome, because we rely on an external API. \nMaking Queries\nThe most basic query in Django is the use of the Manager object: \nUser.objects.filter(active=True)\n\nOf course, this only works if the data is actually represented in your data model. This is not always the case. In those cases, you can consider the options below. \nCustom tags and filters\nThe first alternative is useful for queries that are merely presentational: custom tags and template filters. \ntemplate.html\n
<h1>Welcome, {{ user|friendly_name }}</h1>
\n\ntemplate_tags.py\n@register.filter\ndef friendly_name(user):\n return remote_api.get_cached_name(user.id)\n\nQuery methods\nIf your query is not merely presentational, you could add queries to your services.py (if you are using that), or introduce a queries.py module: \nqueries.py\ndef inactive_users():\n return User.objects.filter(active=False)\n\n\ndef users_called_publysher():\n for user in User.objects.all():\n if remote_api.get_cached_name(user.id) == \"publysher\":\n yield user \n\nProxy models\nProxy models are very useful in the context of business logic and reporting. You basically define an enhanced subset of your model. \nmodels.py\nclass InactiveUserManager(models.Manager):\n def get_query_set(self):\n query_set = super(InactiveUserManager, self).get_query_set()\n return query_set.filter(active=False)\n\nclass InactiveUser(User):\n \"\"\"\n >>> for user in InactiveUser.objects.all():\n ... assert user.active is False \n \"\"\"\n\n objects = InactiveUserManager()\n class Meta:\n proxy = True\n\nQuery models\nFor queries that are inherently complex, but are executed quite often, there is the possibility of query models. A query model is a form of denormalization where relevant data for a single query is stored in a separate model. The trick of course is to keep the denormalized model in sync with the primary model. Query models can only be used if changes are entirely under your control. \nmodels.py\nclass InactiveUserDistribution(models.Model):\n country = CharField(max_length=200)\n inactive_user_count = IntegerField(default=0)\n\nThe first option is to update these models in your commands. This is very useful if these models are only changed by one or two commands. 
\nforms.py\nclass ActivateUserForm(forms.Form):\n # see above\n\n def execute(self):\n # see above\n query_model, _ = InactiveUserDistribution.objects.get_or_create(country=user.country)\n query_model.inactive_user_count -= 1\n query_model.save()\n\nA better option would be to use custom signals. These signals are of course emitted by your commands. Signals have the advantage that you can keep multiple query models in sync with your original model. Furthermore, signal processing can be offloaded to background tasks, using Celery or similar frameworks. \nsignals.py\nuser_activated = Signal(providing_args = ['user'])\nuser_deactivated = Signal(providing_args = ['user'])\n\nforms.py\nclass ActivateUserForm(forms.Form):\n # see above\n\n def execute(self):\n # see above\n user_activated.send_robust(sender=self, user=user)\n\nmodels.py\nclass InactiveUserDistribution(models.Model):\n # see above\n\n@receiver(user_activated)\ndef on_user_activated(sender, **kwargs):\n user = kwargs['user']\n query_model, _ = InactiveUserDistribution.objects.get_or_create(country=user.country)\n query_model.inactive_user_count -= 1\n query_model.save()\n\nKeeping it clean\nWhen using this approach, it becomes ridiculously easy to determine if your code stays clean. Just follow these guidelines:\n\nDoes my model contain methods that do more than managing database state? You should extract a command. \nDoes my model contain properties that do not map to database fields? You should extract a query. \nDoes my model reference infrastructure that is not my database (such as mail)? You should extract a command.\n\nThe same goes for views (because views often suffer from the same problem). \n\nDoes my view actively manage database models? You should extract a command.\n\nSome References\nDjango documentation: proxy models\nDjango documentation: signals\nArchitecture: Domain Driven Design\n"} {"input": "What is a monkey patch? I am trying to understand, what is a monkey patch? 
\nIs that something like methods/operators overloading or delegating? \nDoes it have anything common with these things?\n", "output": "No, it's not like any of those things. It's simply the dynamic replacement of attributes at runtime.\nFor instance, consider a class that has a method get_data. This method does an external lookup (on a database or web API, for example), and various other methods in the class call it. However, in a unit test, you don't want to depend on the external data source - so you dynamically replace the get_data method with a stub that returns some fixed data.\nBecause Python classes are mutable, and methods are just attributes of the class, you can do this as much as you like - and, in fact, you can even replace classes and functions in a module in exactly the same way.\n"} {"input": "Tabs versus spaces in Python programming I have always used tabs for indentation when I do Python programming. But then I came across a question here on SO where someone pointed out that most Python programmers use spaces instead of tabs to minimize editor-to-editor mistakes.\nHow does that make a difference? Are there other reasons why one would use spaces instead of tabs for Python? Or is it simply not true?\nShould I switch my editor to insert spaces instead of tabs right away or keep on going like I used to?\n", "output": "Tired of chasing after indentation typos ( 8 spaces ? no, 7 oops 9 ... ) I switched my sources to 'tabs only'.\n1 tab == 1 indent level, full stop\nThe point is: if you want to display the indentation as 4, 8 or pi / 12 character width, just change the settings in your text editor, don't mess with the code :p\n(personally I use 4 char width tab... but some would prefer 3 or 8 space, or even use variable width fonts !!)\n"} {"input": "What do the python file extensions, .pyc .pyd .pyo stand for? 
What do these python file extensions mean?\n\n.pyc\n.pyd\n.pyo\n\nWhat are the differences between them and how are they generated from a *.py file?\n", "output": "\n.py: This is normally the input source code that you've written.\n.pyc: This is the compiled bytecode. If you import a module, python will build a *.pyc file that contains the bytecode to make importing it again later easier (and faster).\n.pyo: This is a *.pyc file that was created while optimizations (-O) was on.\n.pyd: This is basically a windows dll file. http://docs.python.org/faq/windows.html#is-a-pyd-file-the-same-as-a-dll\n\nAlso for some further discussion on .pyc vs .pyo, take a look at: http://www.network-theory.co.uk/docs/pytut/CompiledPythonfiles.html (I've copied the important part below)\n\n\nWhen the Python interpreter is invoked with the -O flag, optimized code is generated and stored in '.pyo' files. The optimizer currently doesn't help much; it only removes assert statements. When -O is used, all bytecode is optimized; .pyc files are ignored and .py files are compiled to optimized bytecode.\nPassing two -O flags to the Python interpreter (-OO) will cause the bytecode compiler to perform optimizations that could in some rare cases result in malfunctioning programs. Currently only __doc__ strings are removed from the bytecode, resulting in more compact '.pyo' files. 
Since some programs may rely on having these available, you should only use this option if you know what you're doing.\nA program doesn't run any faster when it is read from a '.pyc' or '.pyo' file than when it is read from a '.py' file; the only thing that's faster about '.pyc' or '.pyo' files is the speed with which they are loaded.\nWhen a script is run by giving its name on the command line, the bytecode for the script is never written to a '.pyc' or '.pyo' file. Thus, the startup time of a script may be reduced by moving most of its code to a module and having a small bootstrap script that imports that module. It is also possible to name a '.pyc' or '.pyo' file directly on the command line.\n\n\n"} {"input": "Can I use Python as a Bash replacement? I currently do my textfile manipulation through a bunch of badly remembered AWK, sed, Bash and a tiny bit of Perl.\nI've seen mentioned a few places that python is good for this kind of thing, I know a little and I would like to know more. Is Python a good choice for this, and is there a good book or guide to learning how to use Python to replace shell scripting, AWK, sed and friends?\n", "output": "Any shell has several sets of features.\n\nThe Essential Linux/Unix commands. All of these are available through the subprocess library. This isn't always the best first choice for doing all external commands. Look also at shutil for some commands that are separate Linux commands, but you could probably implement directly in your Python scripts. Another huge batch of Linux commands are in the os library; you can do these more simply in Python.\nAnd -- bonus! -- more quickly. 
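As a hedged sketch of that point -- the directory and file names below are invented for the example -- here is how a few everyday shell commands map onto os and shutil calls, with no subprocess forked for any of them:

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory so the example is self-contained.
work = tempfile.mkdtemp()
src = os.path.join(work, "src")

os.makedirs(os.path.join(src, "nested"))   # mkdir -p src/nested
with open(os.path.join(src, "data.txt"), "w") as f:
    f.write("hello")                       # echo hello > src/data.txt

dst = os.path.join(work, "dst")
shutil.copytree(src, dst)                  # cp -r src dst

print(sorted(os.listdir(dst)))             # ['data.txt', 'nested']

shutil.rmtree(work)                        # rm -r work
```

Every one of these runs in the interpreter's own process; the shell equivalents would each fork.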
Each separate Linux command in the shell (with a few exceptions) forks a subprocess. By using Python shutil and os modules, you don't fork a subprocess.\nThe shell environment features. This includes stuff that sets a command's environment (current directory and environment variables and what-not). You can easily manage this from Python directly.\nThe shell programming features. This is all the process status code checking, the various logic commands (if, while, for, etc.), the test command and all of its relatives. The function definition stuff. This is all much, much easier in Python. This is one of the huge victories in getting rid of bash and doing it in Python.\nInteraction features. This includes command history and what-not. You don't need this for writing shell scripts. This is only for human interaction, and not for script-writing.\nThe shell file management features. This includes redirection and pipelines. This is trickier. Much of this can be done with subprocess. But some things that are easy in the shell are unpleasant in Python. Specifically stuff like (a | b; c ) | something >result. This runs two processes in parallel (with output of a as input to b), followed by a third process. The output from that sequence is run in parallel with something and the output is collected into a file named result. That's just complex to express in any other language.\n\nSpecific programs (awk, sed, grep, etc.) can often be rewritten as Python modules. Don't go overboard. Replace what you need and evolve your \"grep\" module. Don't start out writing a Python module that replaces \"grep\".\nThe best thing is that you can do this in steps.\n\nReplace AWK and PERL with Python. Leave everything else alone.\nLook at replacing GREP with Python. This can be a bit more complex, but your version of GREP can be tailored to your processing needs.\nLook at replacing FIND with Python loops that use os.walk. 
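A minimal os.walk-based stand-in for find might look like the sketch below; the suffix filter is a hypothetical example, not part of the original answer:

```python
import os

# Rough stand-in for `find <start> -name '*<suffix>'`.
def find_files(start, suffix):
    matches = []
    for dirpath, dirnames, filenames in os.walk(start):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(dirpath, name))
    return matches

# e.g. find_files('.', '.py') lists every .py file below the current directory
```

Unlike the shell pipeline, the matching happens in-process, and you can filter on anything Python can express, not just file-name patterns.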
This is a big win because you don't spawn as many processes.\nLook at replacing common shell logic (loops, decisions, etc.) with Python scripts.\n\n"} {"input": "How do I execute a program from python? os.system fails due to spaces in path I have a python script that needs to execute an external program, but for some reason fails.\nIf I have the following script:\nimport os;\nos.system(\"C:\\\\Temp\\\\a b c\\\\Notepad.exe\");\nraw_input();\n\nThen it fails with the following error:\n'C:\\Temp\\a' is not recognized as an internal or external command, operable program or batch file.\n\nIf I escape the program with quotes:\nimport os;\nos.system('\"C:\\\\Temp\\\\a b c\\\\Notepad.exe\"');\nraw_input();\n\nThen it works. However, if I add a parameter, it stops working again:\nimport os;\nos.system('\"C:\\\\Temp\\\\a b c\\\\Notepad.exe\" \"C:\\\\test.txt\"');\nraw_input();\n\nWhat is the right way to execute a program and wait for it to complete? I do not need to read output from it, as it is a visual program that does a job and then just exits, but I need to wait for it to complete.\nAlso note, moving the program to a non-spaced path is not an option either.\n\nEdit This does not work either:\nimport os;\nos.system(\"'C:\\\\Temp\\\\a b c\\\\Notepad.exe'\");\nraw_input();\n\nNote the swapped single/double quotes.\nwith or without a parameter to notepad here, it fails with the error message\nThe filename, directory name, or volume label syntax is incorrect.\n\n", "output": "subprocess.call will avoid problems with having to deal with quoting conventions of various shells. It accepts a list, rather than a string, so arguments are more easily delimited. 
i.e.\nimport subprocess\nsubprocess.call(['C:\\\\Temp\\\\a b c\\\\Notepad.exe', 'C:\\\\test.txt'])\n\n"} {"input": "Adding new column to existing DataFrame in Python pandas I have a DataFrame with named columns and rows indexed with non-continuous numbers, as from the code:\ndf1 = DataFrame(np.random.randn(10, 4), columns=['a', 'b', 'c', 'd'])\nmask = df1.applymap(lambda x: x <-0.7)\ndf1 = df1[-mask.any(axis=1)]\nsLength = len(df1['a'])\ne = Series(np.random.randn(sLength))\n\nI would like to add a new column, 'e', to the existing data frame without changing anything else in the data frame. (The series always has the same length as the data frame.) I tried different versions of join, append, merge, but I did not get what I wanted, only errors.\nThe series and data frame are already given, and the above code is only to illustrate it with an example.\nI am sure there is some easy way to do that, but I can't figure it out.\n", "output": "Use the original df1 indexes to create the series:\ndf1['e'] = Series(np.random.randn(sLength), index=df1.index)\n\n\n\nEdit 2015\nSome readers reported getting the SettingWithCopyWarning with this code.\nHowever, the code still runs perfectly with the current pandas version 0.16.1.\n>>> sLength = len(df1['a'])\n>>> df1\n a b c d\n6 -0.269221 -0.026476 0.997517 1.294385\n8 0.917438 0.847941 0.034235 -0.448948\n\n>>> df1['e'] = p.Series(np.random.randn(sLength), index=df1.index)\n>>> df1\n a b c d e\n6 -0.269221 -0.026476 0.997517 1.294385 1.757167\n8 0.917438 0.847941 0.034235 -0.448948 2.228131\n\n>>> p.version.short_version\n'0.16.1'\n\nThe SettingWithCopyWarning aims to inform of a possibly invalid assignment on a copy of the Dataframe. It doesn't necessarily say you did it wrong (it can trigger false positives) but from 0.13.0 it lets you know there are more adequate methods for the same purpose. 
Then, if you get the warning, just follow its advice: Try using .loc[row_index,col_indexer] = value instead\n>>> df1.loc[:,'f'] = p.Series(np.random.randn(sLength), index=df1.index)\n>>> df1\n a b c d e f\n6 -0.269221 -0.026476 0.997517 1.294385 1.757167 -0.050927\n8 0.917438 0.847941 0.034235 -0.448948 2.228131 0.006109\n>>> \n\nIn fact, this is currently the more efficient method as described in pandas docs\n"} {"input": "Why do some functions have underscores \"__\" before and after the function name? This seems to occur a lot, and was wondering if this was a requirement in the Python language, or merely a matter of convention?\nAlso, could someone name and explain which functions tend to have the underscores, and why (__init__, for instance)?\n", "output": "From the Python PEP 8 -- Style Guide for Python Code (http://www.python.org/dev/peps/pep-0008/):\n\nthe following special forms using leading or trailing underscores are\n recognized (these can generally be combined with any case convention):\n\n_single_leading_underscore: weak \"internal use\" indicator. E.g. \"from M import *\" does not import objects whose name starts with an\n underscore.\nsingle_trailing_underscore_: used by convention to avoid conflicts with Python keyword, e.g.\nTkinter.Toplevel(master, class_='ClassName')\n__double_leading_underscore: when naming a class attribute, invokes name mangling (inside class FooBar, __boo becomes _FooBar__boo; see below). \n__double_leading_and_trailing_underscore__: \"magic\" objects or attributes that live in user-controlled namespaces. E.g. __init__,\n __import__ or __file__. Never invent such names; only use them as documented.\n\n\nNote that names with double leading and trailing underscores are essentially reserved for Python itself: \"Never invent such names; only use them as documented\".\n"} {"input": "How do I fix 'ImportError: cannot import name IncompleteRead'? 
When I try to install anything with pip or pip3, I get:\n$ sudo pip3 install python3-tk\nTraceback (most recent call last):\n File \"/usr/bin/pip3\", line 9, in <module>\n load_entry_point('pip==1.5.6', 'console_scripts', 'pip3')()\n File \"/usr/lib/python3/dist-packages/pkg_resources.py\", line 356, in load_entry_point\n return get_distribution(dist).load_entry_point(group, name)\n File \"/usr/lib/python3/dist-packages/pkg_resources.py\", line 2476, in load_entry_point\n return ep.load()\n File \"/usr/lib/python3/dist-packages/pkg_resources.py\", line 2190, in load\n ['__name__'])\n File \"/usr/lib/python3/dist-packages/pip/__init__.py\", line 61, in <module>\n from pip.vcs import git, mercurial, subversion, bazaar # noqa\n File \"/usr/lib/python3/dist-packages/pip/vcs/mercurial.py\", line 9, in <module>\n from pip.download import path_to_url\n File \"/usr/lib/python3/dist-packages/pip/download.py\", line 25, in <module>\n from requests.compat import IncompleteRead\nImportError: cannot import name 'IncompleteRead'\n\nI have an Ubuntu 14.10 system.\nHow can I fix this problem?\n", "output": "While this previous answer might be the reason, this snippet worked for me as a solution (in Ubuntu 14.04):\nFirst remove the package from the package manager:\n# apt-get remove python-pip\n\nAnd then install the latest version by side:\n# easy_install pip\n\n(thanks to @Aufziehvogel, @JunchaoGu)\n"} {"input": "How can I parse a YAML file in Python How can I parse a YAML file in Python?\n", "output": "The easiest and purest method without relying on C headers is PyYaml:\n#!/usr/bin/env python\n\nimport yaml\n\nwith open(\"example.yaml\", 'r') as stream:\n try:\n print(yaml.load(stream))\n except yaml.YAMLError as exc:\n print(exc)\n\nErr.. that's it... how many lines of code would that take me in Java... any ideas? : ) more info here:\nhttp://pyyaml.org/wiki/PyYAMLDocumentation\n"} {"input": "How to avoid .pyc files? 
Can I run the python interpreter without generating the compiled .pyc files?\n", "output": "From \"What's New in Python 2.6 - Interpreter Changes\":\n\nPython can now be prevented from\n writing .pyc or .pyo files by\n supplying the -B switch to the Python\n interpreter, or by setting the\n PYTHONDONTWRITEBYTECODE environment\n variable before running the\n interpreter. This setting is available\n to Python programs as the\n sys.dont_write_bytecode variable, and\n Python code can change the value to\n modify the interpreter's behaviour.\n\nUpdate 2010-11-27: Python 3.2 addresses the issue of cluttering source folders with .pyc files by introducing a special __pycache__ subfolder, see What's New in Python 3.2 - PYC Repository Directories.\n"} {"input": "What does the at (@) symbol do in Python I'm looking at some Python code which used the @ symbol, but I have no idea what it does. I also do not know what to search for as searching python docs or Google does not return relevant results when the @ symbol is included.\n", "output": "Preamble\nI admit it took more than a few moments to fully grasp this concept for me, so I'll share what I've learned to save others the trouble.\nThe name decorator - the thing we define using the @ syntax before a function definition - was probably the main culprit here.\nExample\nclass Pizza(object):\n def __init__(self):\n self.toppings = []\n def __call__(self, topping):\n # when using '@instance_of_pizza' before a function def\n # the function gets passed onto 'topping'\n self.toppings.append(topping())\n def __repr__(self):\n return str(self.toppings)\n\npizza = Pizza()\n\n@pizza\ndef cheese():\n return 'cheese'\n@pizza\ndef sauce():\n return 'sauce'\n\nprint pizza\n# ['cheese', 'sauce']\n\nWhat this shows is that the function/method/class you're defining after a decorator is just basically passed on as an argument to the function/method immediately after the @ sign.\nFirst sighting\nThe 
microframework Flask introduces decorators from the very beginning in the following format:\nfrom flask import Flask\napp = Flask(__name__)\n\n@app.route(\"/\")\ndef hello():\n return \"Hello World!\"\n\nThis in turn translates to:\nrule = \"/\"\nview_func = hello\n# they go as arguments here in 'flask/app.py'\ndef add_url_rule(self, rule, endpoint=None, view_func=None, **options):\n pass\n\nRealizing this finally allowed me to feel at peace with flask.\n"} {"input": "Typical Angular.js workflow and project structure (with Python Flask) I am pretty new to this whole MV* client-side framework frenzy. It doesn't have to be Angular.js, but I picked it because it feels more natural to me than either Knockout, Ember or Backbone. Anyway what is the workflow like? Do people start with developing a client-side application in Angular.js and then hooking up the back-end to it?\nOr the other way around by first building the back-end in Django, Flask, Rails and then attaching an Angular.js app to it? Is there a \"right\" way of doing it, or is it just a personal preference in the end?\nI am also not sure whether to structure my project according to Flask or Angular.js community practices.\nFor example, Flask's minitwit app is structured like so:\nminitwit\n|-- minitwit.py\n|-- static\n |-- css, js, images, etc...\n`-- templates\n |-- html files and base layout\n\nAngular.js tutorial app is structured like this:\nangular-phonecat\n|-- app\n `-- css\n `-- img\n `-- js\n `-- lib\n `-- partials\n `-- index.html\n|-- scripts\n `-- node.js server and test server files\n\nI could picture a Flask app by itself, and it's fairly easy to see an Angular.js app like a ToDo List by itself, but when it comes to using both of these technologies I don't understand how they work together. It almost seems like I don't need a server-side web-framework when I already have Angular.js; a simple Python web server will suffice. 
In the Angular to-do app, for example, they use MongoLab to talk to the database using a RESTful API. There was no need to have a web framework on the back-end.\nMaybe I am just awfully confused, and Angular.js is nothing more than a fancy jQuery library, so I should use it just like I would use jQuery in my Flask projects (assuming I change Angular's template syntax to something that doesn't conflict with Jinja2). I hope my questions make some sense. I mainly work on the back-end and this client-side framework is unknown territory for me.\n", "output": "I would start out by organizing the Flask app in the standard structure as follows:\napp\n|-- app.py\n|-- static\n |-- css\n |-- img\n |-- js\n|-- templates\n\nAnd as btford mentioned, if you are doing an Angular app, you'll want to focus on using Angular client-side templates and stay away from server-side templates. Using render_template('index.html') will cause Flask to interpret your angular templates as jinja templates, so they won't render correctly. Instead, you'll want to do the following:\n@app.route(\"/\")\ndef index():\n return send_file('templates/index.html')\n\nNote that using send_file() means that the files will be cached, so you might want to use make_response() instead, at least for development:\n return make_response(open('templates/index.html').read())\n\nAfterwards, build out the AngularJS part of your app, modifying the app structure so that it looks like this:\napp\n|-- app.py\n|-- static\n |-- css\n |-- img\n |-- js\n |-- app.js, controllers.js, etc.\n |-- lib\n |-- angular\n |-- angular.js, etc.\n |-- partials\n|-- templates\n |-- index.html\n\nMake sure your index.html includes AngularJS, as well as any other files:\n\n\nAt this point, you haven't yet constructed your RESTful API, so you can have your js controllers return predefined sample data (only a temporary setup). 
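When you later replace that predefined sample data with a real backend, a minimal Flask JSON endpoint might look like the sketch below; the /api/items route and the sample payload are invented for illustration, not part of the original answer:

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Stand-in data; a real app would query a database here.
ITEMS = [{"id": 1, "title": "Learn AngularJS"},
         {"id": 2, "title": "Wire up Flask"}]

@app.route("/api/items")
def list_items():
    # jsonify serializes the payload and sets the JSON Content-Type header.
    return jsonify(items=ITEMS)

# app.run(debug=True)  # uncomment to serve the API during development
```

On the client side, an Angular controller would then fetch /api/items with $http or angular-resource.js instead of returning the hard-coded sample.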
When you're ready, implement the RESTful API and hook it up to your angular app with angular-resource.js.\nEDIT: I put together an app template that, though a little more complex than what I've described above, illustrates how one could build an app with AngularJS + Flask, complete with communication between AngularJS and a simple Flask API. Here it is if you want to check it out: https://github.com/rxl/angular-flask\n"} {"input": "Should Python import statements always be at the top of a module? PEP 8 states:\n\nImports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.\n\nHowever if the class/method/function that I am importing is only used in rare cases, surely it is more efficient to do the import when it is needed?\nIsn't this:\nclass SomeClass(object):\n\n def not_often_called(self):\n from datetime import datetime\n self.datetime = datetime.now()\n\nmore efficient than this?\nfrom datetime import datetime\n\nclass SomeClass(object):\n\n def not_often_called(self):\n self.datetime = datetime.now()\n\n", "output": "Module importing is quite fast, but not instant. This means that:\n\nPutting the imports at the top of the module is fine, because it's a trivial cost that's only paid once.\nPutting the imports within a function will cause calls to that function to take longer.\n\nSo if you care about efficiency, put the imports at the top. Only move them into a function if your profiling shows that would help (you did profile to see where best to improve performance, right??)\n\nThe best reasons I've seen to perform lazy imports are:\n\nOptional library support. If your code has multiple paths that use different libraries, don't break if an optional library is not installed.\nIn the __init__.py of a plugin, which might be imported but not actually used. 
Examples are Bazaar plugins, which use bzrlib's lazy-loading framework.\n\n"} {"input": "Pythonic way to avoid \"if x: return x\" statements I have a method that calls 4 other methods in sequence to check for specific conditions, and returns immediately (not checking the following ones) whenever one returns something Truthy.\ndef check_all_conditions():\n x = check_size()\n if x:\n return x\n\n x = check_color()\n if x:\n return x\n\n x = check_tone()\n if x:\n return x\n\n x = check_flavor()\n if x:\n return x\n return None\n\nThis seems like a lot of baggage code. Instead of each 2-line if statement, I'd rather do something like:\nx and return x\n\nBut that is invalid Python. Am I missing a simple, elegant solution here? Incidentally, in this situation, those four check methods may be expensive, so I do not want to call them multiple times.\n", "output": "Alternatively to Martijn's fine answer, you could chain or. This will return the first truthy value, or None if there's no truthy value:\ndef check_all_conditions():\n return check_size() or check_color() or check_tone() or check_flavor() or None\n\nDemo:\n>>> x = [] or 0 or {} or -1 or None\n>>> x\n-1\n>>> x = [] or 0 or {} or '' or None\n>>> x is None\nTrue\n\n"} {"input": "Get unique values from a list in python I want to get the unique values from the following list:\n[u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']\n\nThe output which I require is:\n[u'nowplaying', u'PBS', u'job', u'debate', u'thenandnow']\n\nI tried the following code:\noutput = []\nfor x in trends:\n if x not in output:\n output.append(x)\nprint output\n\nbut it didn't work. 
Any help would be appreciated.\n", "output": "First declare your list properly, separated by commas\nYou can get the unique values by converting the list to a set\nmylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']\nmyset = set(mylist)\nprint myset\n\nIf you use it further as a list, you should convert it back to list by doing\nmynewlist = list(myset)\n\nAnother possibility, probably faster would be to use a set from the beginning, instead of a list. Then your code should be\n output = set()\n for x in trends:\n output.add(x)\n print output\n\nAs it has been pointed out, the sets do not maintain the original order. If you need that, you should look up the ordered set \n"} {"input": "How can I fill out a Python string with spaces? I want to fill out a string with spaces. I know that the following works for zeros:\n>>> print \"'%06d'\"%4\n'000004'\n\nBut what should I do when I want this?:\n'hi '\n\nof course I can measure string length and do str+\" \"*leftover, but I'd like the shortest way.\n", "output": "You can do this with str.ljust(width[, fillchar]):\n\nReturn the string left justified in a string of length width. Padding is done using the specified fillchar (default is a space). The original string is returned if width is less than len(s).\n\n>>> 'hi'.ljust(10)\n'hi '\n\n"} {"input": "How to set the current working directory in Python? How to set the current working directory in Python?\n", "output": "Try os.chdir\n\nos.chdir(path)\n\n    Change the current working directory to path. Availability: Unix, Windows.\n\n"} {"input": "Can I get JSON to load into an OrderedDict in Python? Ok so I can use an OrderedDict in json.dump. That is, an OrderedDict can be used as an input to JSON.\nBut can it be used as an output? If so how? 
In my case I'd like to load into an OrderedDict so I can keep the order of the keys in the file.\nIf not, is there some kind of workaround?\n", "output": "Yes, you can. By specifying the object_pairs_hook argument to JSONDecoder. In fact, this is the exact example given in the documentation.\n>>> json.JSONDecoder(object_pairs_hook=collections.OrderedDict).decode('{\"foo\":1, \"bar\": 2}')\nOrderedDict([('foo', 1), ('bar', 2)])\n>>> \n\nYou can pass this parameter to json.loads (if you don't need a Decoder instance for other purposes) like so:\n>>> import json\n>>> from collections import OrderedDict\n>>> data = json.loads('{\"foo\":1, \"bar\": 2}', object_pairs_hook=OrderedDict)\n>>> print json.dumps(data, indent=4)\n{\n \"foo\": 1,\n \"bar\": 2\n}\n>>> \n\nUsing json.load is done in the same way:\n>>> data = json.load(open('config.json'), object_pairs_hook=OrderedDict)\n\n"} {"input": "TypeError: 'str' does not support the buffer interface plaintext = input(\"Please enter the text you want to compress\")\nfilename = input(\"Please enter the desired filename\")\nwith gzip.open(filename + \".gz\", \"wb\") as outfile:\n outfile.write(plaintext) \n\nThe above python code is giving me the following error:\nTraceback (most recent call last):\n File \"C:/Users/Ankur Gupta/Desktop/Python_works/gzip_work1.py\", line 33, in <module>\n compress_string()\n File \"C:/Users/Ankur Gupta/Desktop/Python_works/gzip_work1.py\", line 15, in compress_string\n outfile.write(plaintext)\n File \"C:\\Python32\\lib\\gzip.py\", line 312, in write\n self.crc = zlib.crc32(data, self.crc) & 0xffffffff\nTypeError: 'str' does not support the buffer interface\n\n", "output": "If you use Python 3.x, then str is not the same type as in Python 2.x; you must cast it to bytes (encode it).\nplaintext = input(\"Please enter the text you want to compress\")\nfilename = input(\"Please enter the desired filename\")\nwith gzip.open(filename + \".gz\", \"wb\") as outfile:\n outfile.write(bytes(plaintext, 
'UTF-8'))\n\nAlso, do not use variable names like string or file, since those are names of a module or function.\nEDIT @Tom\nYes, non-ASCII text is also compressed/decompressed. I use Polish letters with UTF-8 encoding:\nplaintext = 'Polish text: \u0105\u0107\u0119\u0142\u0144\u00f3\u015b\u017a\u017c\u0104\u0106\u0118\u0141\u0143\u00d3\u015a\u0179\u017b'\nfilename = 'foo.gz'\nwith gzip.open(filename, 'wb') as outfile:\n outfile.write(bytes(plaintext, 'UTF-8'))\nwith gzip.open(filename, 'r') as infile:\n outfile_content = infile.read().decode('UTF-8')\nprint(outfile_content)\n\n"} {"input": "Understanding dict.copy() - shallow or deep? While reading up the documentation for dict.copy(), it says that it makes a shallow copy of the dictionary. Same goes for the book I am following (Beazley's Python Reference), which says: \n\nThe m.copy() method makes a shallow\n copy of the items contained in a\n mapping object and places them in a\n new mapping object.\n\nConsider this:\n>>> original = dict(a=1, b=2)\n>>> new = original.copy()\n>>> new.update({'c': 3})\n>>> original\n{'a': 1, 'b': 2}\n>>> new\n{'a': 1, 'c': 3, 'b': 2}\n\nSo I assumed this would update the value of original (and add 'c': 3) also since I was doing a shallow copy. Like if you do it for a list:\n>>> original = [1, 2, 3]\n>>> new = original\n>>> new.append(4)\n>>> new, original\n([1, 2, 3, 4], [1, 2, 3, 4])\n\nThis works as expected.\nSince both are shallow copies, why is it that dict.copy() doesn't work as I expect it to? 
Or my understanding of shallow vs deep copying is flawed?\n", "output": "\"Shallow copying\" means the content of the dictionary is not copied by value; only a new reference is created.\n>>> a = {1: [1,2,3]}\n>>> b = a.copy()\n>>> a, b\n({1: [1, 2, 3]}, {1: [1, 2, 3]})\n>>> a[1].append(4)\n>>> a, b\n({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})\n\nIn contrast, a deep copy will copy all contents by value.\n>>> import copy\n>>> c = copy.deepcopy(a)\n>>> a, c\n({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})\n>>> a[1].append(5)\n>>> a, c\n({1: [1, 2, 3, 4, 5]}, {1: [1, 2, 3, 4]})\n\nSo:\n\na = b: Reference assignment; makes a and b point to the same object.\na ---,\n v\n {1: L}\n ^ |\nb ---' '----> [1,2,3]\n\na = b.copy(): Shallow copying; a and b become two isolated objects, but their contents still share the same reference\na ---> {1: L}\n | \n >---> [1,2,3]\n |\nb ---> {1: M}\n\na = copy.deepcopy(b): Deep copying; a and b's structure and content become completely isolated.\na ---> {1: L}\n '-----> [1,2,3]\nb ---> {1: M}\n '-----> [1,2,3]\n\n\n"} {"input": "mysql_config not found when installing mysqldb python interface I am trying to get a Python script to run on the linux server I'm connected to via ssh. The script uses mysqldb. 
I have all the other components I need, but when I try to install mySQLdb via setuptools like so:, \npython setup.py install\n\nI get the following error report related to the mysql_config command.\nsh: mysql_config: command not found\nTraceback (most recent call last):\n File \"setup.py\", line 15, in \n metadata, options = get_config()\n File \"/usr/lib/python2.5/MySQL-python-1.2.3/setup_posix.py\", line 43, in get_config\n libs = mysql_config(\"libs_r\")\n File \"/usr/lib/python2.5/MySQL-python-1.2.3/setup_posix.py\", line 24, in mysql_config\n raise EnvironmentError(\"%s not found\" % (mysql_config.path,))\nEnvironmentError: mysql_config not found\n\nHas anyone else encountered this error and if so how did you resolve it/what can I do to successfully install mysqldb? \n", "output": "mySQLdb is a python interface for mysql, but it is not mysql itself. And apparently mySQLdb needs the command 'mysql_config', so you need to install that first.\nCan you confirm that you did or did not install mysql itself, by running \"mysql\" from the shell? That should give you a response other than \"mysql: command not found\". \nWhich linux distribution are you using? Mysql is pre-packaged for most linux distributions. For example, for debian / ubuntu, installing mysql is as easy as \nsudo apt-get install mysql-server\n\nmysql-config is in a different package, which can be installed from (again, assuming debian / ubuntu):\nsudo apt-get install libmysqlclient-dev\n\nif you are using mariadb, the drop in replacement for mysql, then run\nsudo apt-get install libmariadbclient-dev\n\n"} {"input": "How to make an unaware datetime timezone aware in python What I need to do\nI have a timezone-unaware datetime object, to which I need to add a time zone in order to be able to compare it with other timezone-aware datetime objects. 
I do not want to convert my entire application to timezone unaware for this one legacy case.\nWhat I've Tried\nFirst, to demonstrate the problem:\nPython 2.6.1 (r261:67515, Jun 24 2010, 21:47:49) \n[GCC 4.2.1 (Apple Inc. build 5646)] on darwin\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n>>> import datetime\n>>> import pytz\n>>> unaware = datetime.datetime(2011,8,15,8,15,12,0)\n>>> unaware\ndatetime.datetime(2011, 8, 15, 8, 15, 12)\n>>> aware = datetime.datetime(2011,8,15,8,15,12,0,pytz.UTC)\n>>> aware\ndatetime.datetime(2011, 8, 15, 8, 15, 12, tzinfo=<UTC>)\n>>> aware == unaware\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nTypeError: can't compare offset-naive and offset-aware datetimes\n\nFirst, I tried astimezone:\n>>> unaware.astimezone(pytz.UTC)\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nValueError: astimezone() cannot be applied to a naive datetime\n>>>\n\nIt's not terribly surprising this failed, since it's actually trying to do a conversion. Replace seemed like a better choice (as per Python: How to get a value of datetime.today() that is \"timezone aware\"?):\n>>> unaware.replace(tzinfo=pytz.UTC)\ndatetime.datetime(2011, 8, 15, 8, 15, 12, tzinfo=<UTC>)\n>>> unaware == aware\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nTypeError: can't compare offset-naive and offset-aware datetimes\n>>> \n\nBut as you can see, replace seems to set the tzinfo, but not make the object aware. I'm getting ready to fall back to doctoring the input string to have a timezone before parsing it (I'm using dateutil for parsing, if that matters), but that seems incredibly kludgy.\nAlso, I've tried this in both python 2.6 and python 2.7, with the same results.\nContext\nI am writing a parser for some data files. There is an old format I need to support where the date string does not have a timezone indicator. I've already fixed the data source, but I still need to support the legacy data format. 
A one time conversion of the legacy data is not an option for various business BS reasons. While in general, I do not like the idea of hard-coding a default timezone, in this case it seems like the best option. I know with reasonable confidence that all the legacy data in question is in UTC, so I'm prepared to accept the risk of defaulting to that in this case.\n", "output": "In general, to make a naive datetime timezone-aware, use the localize method:\nimport datetime\nimport pytz\n\nunaware = datetime.datetime(2011, 8, 15, 8, 15, 12, 0)\naware = datetime.datetime(2011, 8, 15, 8, 15, 12, 0, pytz.UTC)\n\nnow_aware = pytz.utc.localize(unaware)\nassert aware == now_aware\n\nFor the UTC timezone, it is not really necessary to use localize since there is no daylight savings time calculation to handle:\nnow_aware = unaware.replace(tzinfo=pytz.UTC)\n\nworks. (.replace returns a new datetime; it does not modify unaware.)\n"} {"input": "How to Python prettyprint a JSON file I have a JSON file that is a mess that I want to prettyprint-- what's the easiest way to do this in python? I know PrettyPrint takes an \"object\", which I think can be a file, but I don't know how to pass a file in-- just using the filename doesn't work.\n", "output": "The json module already implements some basic pretty printing with the indent parameter:\n>>> import json\n>>>\n>>> your_json = '[\"foo\", {\"bar\":[\"baz\", null, 1.0, 2]}]'\n>>> parsed = json.loads(your_json)\n>>> print json.dumps(parsed, indent=4, sort_keys=True)\n[\n \"foo\", \n {\n \"bar\": [\n \"baz\", \n null, \n 1.0, \n 2\n ]\n }\n]\n\nTo parse a file, use json.load:\nwith open('filename.txt', 'r') as handle:\n parsed = json.load(handle)\n\n"} {"input": "How to get the position of a character in Python? 
How can I get the position of a character inside a string in python?\n", "output": "There are two string methods for this, find and index.\nExample:\n>>> myString = \"Position of a character\"\n>>> myString.index('s')\n2\n\nThe difference is that find returns -1 when what you're looking for isn't found. index throws an exception.\n\nFunction str.find(sub[, start[, end]])\n Return the lowest index in the string where substring sub is found, such that sub is contained in the range [start, end]. Optional arguments start and end are interpreted as in slice notation. Return -1 if sub is not found. \n\nAnd:\n\nFunction str.index(sub[, start[, end]])\n Like find(), but raise ValueError when the substring is not found.\n\n"} {"input": "Django - How to rename a model field using South? I would like to change a name of specific fields in a model:\nclass Foo(models.Model):\n name = models.CharField()\n rel = models.ForeignKey(Bar)\n\nshould change to:\nclass Foo(models.Model):\n full_name = models.CharField()\n odd_relation = models.ForeignKey(Bar)\n\nWhat's the easiest way to do this using South?\n", "output": "You can use the db.rename_column function.\nclass Migration:\n\n def forwards(self, orm):\n # Rename 'name' field to 'full_name'\n db.rename_column('app_foo', 'name', 'full_name')\n\n\n\n\n def backwards(self, orm):\n # Rename 'full_name' field to 'name'\n db.rename_column('app_foo', 'full_name', 'name')\n\nThe first argument of db.rename_column is the table name, so it's important to remember how Django creates table names:\n\nDjango automatically derives the name of the database table from the name of your model class and the app that contains it. 
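To illustrate the rule quoted above, here is a minimal sketch of the default naming scheme (not Django's actual implementation; the app label and model names are hypothetical, and custom db_table overrides are ignored):

```python
def table_name(app_label, model_class_name):
    # Django's default: "<app_label>_<lowercased class name>", with no
    # extra underscore inserted for camel-cased names like ProjectItem.
    return "%s_%s" % (app_label, model_class_name.lower())

print(table_name("app", "Foo"))          # app_foo
print(table_name("app", "ProjectItem"))  # app_projectitem
```

So a rename such as db.rename_column('app_foo', ...) targets the table 'app_foo', not 'foo'.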
A model's database table name is constructed by joining the model's \"app label\" -- the name you used in manage.py startapp -- to the model's class name, with an underscore between them.\n\nIn the case where you have a multi-worded, camel-cased model name, such as ProjectItem, the table name will be app_projectitem (i.e., an underscore will not be inserted between project and item even though they are camel-cased).\n"} {"input": "Reading binary file in Python and looping over each byte In Python, how do I read in a binary file and loop over each byte of that file?\n", "output": "f = open(\"myfile\", \"rb\")\ntry:\n byte = f.read(1)\n while byte != \"\":\n # Do stuff with byte.\n byte = f.read(1)\nfinally:\n f.close()\n\nBy suggestion of chrispy:\nwith open(\"myfile\", \"rb\") as f:\n byte = f.read(1)\n while byte != \"\":\n # Do stuff with byte.\n byte = f.read(1)\n\nNote that the with statement is not available in versions of Python below 2.5. To use it in v 2.5 you'll need to import it:\nfrom __future__ import with_statement\n\nIn 2.6 this is not needed.\nIn Python 3, it's a bit different. We will no longer get raw characters from the stream in byte mode but byte objects, thus we need to alter the condition:\nwith open(\"myfile\", \"rb\") as f:\n byte = f.read(1)\n while byte != b\"\":\n # Do stuff with byte.\n byte = f.read(1)\n\nOr as benhoyt says, skip the not equal and take advantage of the fact that b\"\" evaluates to false. This makes the code compatible between 2.6 and 3.x without any changes. It would also save you from changing the condition if you go from byte mode to text or the reverse.\nwith open(\"myfile\", \"rb\") as f:\n byte = f.read(1)\n while byte:\n # Do stuff with byte.\n byte = f.read(1)\n\n"} {"input": "How to declare an array in Python? How do I declare an array in Python?\nI can't find any reference to arrays in the documentation. 
\n", "output": "variable = []\n\nNow variable refers to an empty list*.\nOf course this is an assignment, not a declaration. There's no way to say in Python \"this variable should never refer to anything other than a list\", since Python is dynamically typed.\n\n*The default built-in Python type is called a list, not an array. It is an ordered container of arbitrary length that can hold a heterogeneous collection of objects (their types do not matter and can be freely mixed). This should not be confused with the array module, which offers a type closer to the C array type; the contents must be homogeneous (all of the same type), but the length is still dynamic.\n"} {"input": "Using module 'subprocess' with timeout Here's the Python code to run an arbitrary command returning its stdout data, or raise an exception on non-zero exit codes:\nproc = subprocess.Popen(\n cmd,\n stderr=subprocess.STDOUT, # Merge stdout and stderr\n stdout=subprocess.PIPE,\n shell=True)\n\ncommunicate is used to wait for the process to exit:\nstdoutdata, stderrdata = proc.communicate()\n\nThe subprocess module does not support timeout--ability to kill a process running for more than X number of seconds--therefore, communicate may take forever to run.\nWhat is the simplest way to implement timeouts in a Python program meant to run on Windows and Linux?\n", "output": "I don't know much about the low level details; but, given that in\npython 2.6 the API offers the ability to wait for threads and\nterminate processes, what about running the process in a separate\nthread?\nimport subprocess, threading\n\nclass Command(object):\n def __init__(self, cmd):\n self.cmd = cmd\n self.process = None\n\n def run(self, timeout):\n def target():\n print 'Thread started'\n self.process = subprocess.Popen(self.cmd, shell=True)\n self.process.communicate()\n print 'Thread finished'\n\n thread = threading.Thread(target=target)\n thread.start()\n\n thread.join(timeout)\n if thread.is_alive():\n print 'Terminating 
process'\n self.process.terminate()\n thread.join()\n print self.process.returncode\n\ncommand = Command(\"echo 'Process started'; sleep 2; echo 'Process finished'\")\ncommand.run(timeout=3)\ncommand.run(timeout=1)\n\nThe output of this snippet in my machine is:\nThread started\nProcess started\nProcess finished\nThread finished\n0\nThread started\nProcess started\nTerminating process\nThread finished\n-15\n\nwhere it can be seen that, in the first execution, the process\nfinished correctly (return code 0), while the in the second one the\nprocess was terminated (return code -15).\nI haven't tested in windows; but, aside from updating the example\ncommand, I think it should work since I haven't found in the\ndocumentation anything that says that thread.join or process.terminate\nis not supported.\n"} {"input": "pip install lxml error this is my error :\n(mysite)zjm1126@zjm1126-G41MT-S2:~/zjm_test/mysite$ pip install lxml\nDownloading/unpacking lxml\n Running setup.py egg_info for package lxml\n Building lxml version 2.3.\n Building without Cython.\n ERROR: /bin/sh: xslt-config: not found\n\n ** make sure the development packages of libxml2 and libxslt are installed **\n\n Using build configuration of libxslt\nInstalling collected packages: lxml\n Running setup.py install for lxml\n Building lxml version 2.3.\n Building without Cython.\n ERROR: /bin/sh: xslt-config: not found\n\n ** make sure the development packages of libxml2 and libxslt are installed **\n\n Using build configuration of libxslt\n building 'lxml.etree' extension\n gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/python2.6 -c src/lxml/lxml.etree.c -o build/temp.linux-i686-2.6/src/lxml/lxml.etree.o -w\n src/lxml/lxml.etree.c:4: fatal error: Python.h: \u00e6\u00b2\u00a1\u00e6\u009c\u0089\u00e9\u0082\u00a3\u00e4\u00b8\u00aa\u00e6\u0096\u0087\u00e4\u00bb\u00b6\u00e6\u0088\u0096\u00e7\u009b\u00ae\u00e5\u00bd\u0095\n compilation terminated.\n error: 
command 'gcc' failed with exit status 1\n Complete output from command /home/zjm1126/zjm_test/mysite/bin/python -c \"import setuptools;__file__='/home/zjm1126/zjm_test/mysite/build/lxml/setup.py';execfile(__file__)\" install --single-version-externally-managed --record /tmp/pip-jOhgvD-record/install-record.txt --install-headers /home/zjm1126/zjm_test/mysite/include/site/python2.6:\n Building lxml version 2.3.\n\nBuilding without Cython.\n\nERROR: /bin/sh: xslt-config: not found\n\n\n\n** make sure the development packages of libxml2 and libxslt are installed **\n\n\n\nUsing build configuration of libxslt\n\nrunning install\n\nrunning build\n\nrunning build_py\n\nrunning build_ext\n\nbuilding 'lxml.etree' extension\n\ngcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/python2.6 -c src/lxml/lxml.etree.c -o build/temp.linux-i686-2.6/src/lxml/lxml.etree.o -w\n\nsrc/lxml/lxml.etree.c:4: fatal error: Python.h: \u00e6\u00b2\u00a1\u00e6\u009c\u0089\u00e9\u0082\u00a3\u00e4\u00b8\u00aa\u00e6\u0096\u0087\u00e4\u00bb\u00b6\u00e6\u0088\u0096\u00e7\u009b\u00ae\u00e5\u00bd\u0095\n\ncompilation terminated.\n\nerror: command 'gcc' failed with exit status 1\n\n----------------------------------------\nCommand /home/zjm1126/zjm_test/mysite/bin/python -c \"import setuptools;__file__='/home/zjm1126/zjm_test/mysite/build/lxml/setup.py';execfile(__file__)\" install --single-version-externally-managed --record /tmp/pip-jOhgvD-record/install-record.txt --install-headers /home/zjm1126/zjm_test/mysite/include/site/python2.6 failed with error code 1\nStoring complete log in /home/zjm1126/.pip/pip.log\n\nwhat can i do ,\nthanks\nupdated:\n(mysite)zjm1126@zjm1126-G41MT-S2:~/zjm_test/mysite$ pip install lxml\nDownloading/unpacking lxml\n Running setup.py egg_info for package lxml\n Building lxml version 2.3.\n Building without Cython.\n Using build configuration of libxslt 1.1.26\n Building against libxml2/libxslt in the following directory: 
/usr/lib\nInstalling collected packages: lxml\n Running setup.py install for lxml\n Building lxml version 2.3.\n Building without Cython.\n Using build configuration of libxslt 1.1.26\n Building against libxml2/libxslt in the following directory: /usr/lib\n building 'lxml.etree' extension\n gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -I/usr/include/python2.6 -c src/lxml/lxml.etree.c -o build/temp.linux-i686-2.6/src/lxml/lxml.etree.o -w\n src/lxml/lxml.etree.c:4: fatal error: Python.h: \u00e6\u00b2\u00a1\u00e6\u009c\u0089\u00e9\u0082\u00a3\u00e4\u00b8\u00aa\u00e6\u0096\u0087\u00e4\u00bb\u00b6\u00e6\u0088\u0096\u00e7\u009b\u00ae\u00e5\u00bd\u0095\n compilation terminated.\n error: command 'gcc' failed with exit status 1\n Complete output from command /home/zjm1126/zjm_test/mysite/bin/python -c \"import setuptools;__file__='/home/zjm1126/zjm_test/mysite/build/lxml/setup.py';execfile(__file__)\" install --single-version-externally-managed --record /tmp/pip-NJw2ws-record/install-record.txt --install-headers /home/zjm1126/zjm_test/mysite/include/site/python2.6:\n Building lxml version 2.3.\n\nBuilding without Cython.\n\nUsing build configuration of libxslt 1.1.26\n\nBuilding against libxml2/libxslt in the following directory: /usr/lib\n\nrunning install\n\nrunning build\n\nrunning build_py\n\nrunning build_ext\n\nbuilding 'lxml.etree' extension\n\ngcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -I/usr/include/python2.6 -c src/lxml/lxml.etree.c -o build/temp.linux-i686-2.6/src/lxml/lxml.etree.o -w\n\nsrc/lxml/lxml.etree.c:4: fatal error: Python.h: \u00e6\u00b2\u00a1\u00e6\u009c\u0089\u00e9\u0082\u00a3\u00e4\u00b8\u00aa\u00e6\u0096\u0087\u00e4\u00bb\u00b6\u00e6\u0088\u0096\u00e7\u009b\u00ae\u00e5\u00bd\u0095\n\ncompilation terminated.\n\nerror: command 'gcc' failed with exit status 1\n\n----------------------------------------\nCommand 
/home/zjm1126/zjm_test/mysite/bin/python -c \"import setuptools;__file__='/home/zjm1126/zjm_test/mysite/build/lxml/setup.py';execfile(__file__)\" install --single-version-externally-managed --record /tmp/pip-NJw2ws-record/install-record.txt --install-headers /home/zjm1126/zjm_test/mysite/include/site/python2.6 failed with error code 1\nStoring complete log in /home/zjm1126/.pip/pip.log\n\nthe log :\n------------------------------------------------------------\n/home/zjm1126/zjm_test/mysite/bin/pip run on Thu Mar 3 17:07:27 2011\nDownloading/unpacking mysql-python\n Running setup.py egg_info for package mysql-python\n running egg_info\n creating pip-egg-info/MySQL_python.egg-info\n writing pip-egg-info/MySQL_python.egg-info/PKG-INFO\n writing top-level names to pip-egg-info/MySQL_python.egg-info/top_level.txt\n writing dependency_links to pip-egg-info/MySQL_python.egg-info/dependency_links.txt\n writing pip-egg-info/MySQL_python.egg-info/PKG-INFO\n writing top-level names to pip-egg-info/MySQL_python.egg-info/top_level.txt\n writing dependency_links to pip-egg-info/MySQL_python.egg-info/dependency_links.txt\n writing manifest file 'pip-egg-info/MySQL_python.egg-info/SOURCES.txt'\n warning: manifest_maker: standard file '-c' not found\n reading manifest file 'pip-egg-info/MySQL_python.egg-info/SOURCES.txt'\n reading manifest template 'MANIFEST.in'\n warning: no files found matching 'MANIFEST'\n warning: no files found matching 'ChangeLog'\n warning: no files found matching 'GPL'\n writing manifest file 'pip-egg-info/MySQL_python.egg-info/SOURCES.txt'\nInstalling collected packages: mysql-python\n Running setup.py install for mysql-python\n Running command /home/zjm1126/zjm_test/mysite/bin/python -c \"import setuptools;__file__='/home/zjm1126/zjm_test/mysite/build/mysql-python/setup.py';execfile(__file__)\" install --single-version-externally-managed --record /tmp/pip-XuVIux-record/install-record.txt --install-headers 
/home/zjm1126/zjm_test/mysite/include/site/python2.6\n running install\n running build\n running build_py\n creating build\n creating build/lib.linux-i686-2.6\n copying _mysql_exceptions.py -> build/lib.linux-i686-2.6\n creating build/lib.linux-i686-2.6/MySQLdb\n copying MySQLdb/__init__.py -> build/lib.linux-i686-2.6/MySQLdb\n copying MySQLdb/converters.py -> build/lib.linux-i686-2.6/MySQLdb\n copying MySQLdb/connections.py -> build/lib.linux-i686-2.6/MySQLdb\n copying MySQLdb/cursors.py -> build/lib.linux-i686-2.6/MySQLdb\n copying MySQLdb/release.py -> build/lib.linux-i686-2.6/MySQLdb\n copying MySQLdb/times.py -> build/lib.linux-i686-2.6/MySQLdb\n creating build/lib.linux-i686-2.6/MySQLdb/constants\n copying MySQLdb/constants/__init__.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n copying MySQLdb/constants/CR.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n copying MySQLdb/constants/FIELD_TYPE.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n copying MySQLdb/constants/ER.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n copying MySQLdb/constants/FLAG.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n copying MySQLdb/constants/REFRESH.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n copying MySQLdb/constants/CLIENT.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n running build_ext\n building '_mysql' extension\n creating build/temp.linux-i686-2.6\n gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -Dversion_info=(1,2,3,'final',0) -D__version__=1.2.3 -I/usr/include/mysql -I/usr/include/python2.6 -c _mysql.c -o build/temp.linux-i686-2.6/_mysql.o -DBIG_JOINS=1 -fno-strict-aliasing -DUNIV_LINUX -DUNIV_LINUX\n In file included from _mysql.c:29:\n pymemcompat.h:10: fatal error: Python.h: \u00e6\u00b2\u00a1\u00e6\u009c\u0089\u00e9\u0082\u00a3\u00e4\u00b8\u00aa\u00e6\u0096\u0087\u00e4\u00bb\u00b6\u00e6\u0088\u0096\u00e7\u009b\u00ae\u00e5\u00bd\u0095\n compilation terminated.\n error: command 'gcc' failed with exit 
status 1\n Complete output from command /home/zjm1126/zjm_test/mysite/bin/python -c \"import setuptools;__file__='/home/zjm1126/zjm_test/mysite/build/mysql-python/setup.py';execfile(__file__)\" install --single-version-externally-managed --record /tmp/pip-XuVIux-record/install-record.txt --install-headers /home/zjm1126/zjm_test/mysite/include/site/python2.6:\n running install\n\nrunning build\n\nrunning build_py\n\ncreating build\n\ncreating build/lib.linux-i686-2.6\n\ncopying _mysql_exceptions.py -> build/lib.linux-i686-2.6\n\ncreating build/lib.linux-i686-2.6/MySQLdb\n\ncopying MySQLdb/__init__.py -> build/lib.linux-i686-2.6/MySQLdb\n\ncopying MySQLdb/converters.py -> build/lib.linux-i686-2.6/MySQLdb\n\ncopying MySQLdb/connections.py -> build/lib.linux-i686-2.6/MySQLdb\n\ncopying MySQLdb/cursors.py -> build/lib.linux-i686-2.6/MySQLdb\n\ncopying MySQLdb/release.py -> build/lib.linux-i686-2.6/MySQLdb\n\ncopying MySQLdb/times.py -> build/lib.linux-i686-2.6/MySQLdb\n\ncreating build/lib.linux-i686-2.6/MySQLdb/constants\n\ncopying MySQLdb/constants/__init__.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n\ncopying MySQLdb/constants/CR.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n\ncopying MySQLdb/constants/FIELD_TYPE.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n\ncopying MySQLdb/constants/ER.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n\ncopying MySQLdb/constants/FLAG.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n\ncopying MySQLdb/constants/REFRESH.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n\ncopying MySQLdb/constants/CLIENT.py -> build/lib.linux-i686-2.6/MySQLdb/constants\n\nrunning build_ext\n\nbuilding '_mysql' extension\n\ncreating build/temp.linux-i686-2.6\n\ngcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -Dversion_info=(1,2,3,'final',0) -D__version__=1.2.3 -I/usr/include/mysql -I/usr/include/python2.6 -c _mysql.c -o build/temp.linux-i686-2.6/_mysql.o -DBIG_JOINS=1 -fno-strict-aliasing 
-DUNIV_LINUX -DUNIV_LINUX\n\nIn file included from _mysql.c:29:\n\npymemcompat.h:10: fatal error: Python.h: \u00e6\u00b2\u00a1\u00e6\u009c\u0089\u00e9\u0082\u00a3\u00e4\u00b8\u00aa\u00e6\u0096\u0087\u00e4\u00bb\u00b6\u00e6\u0088\u0096\u00e7\u009b\u00ae\u00e5\u00bd\u0095\n\ncompilation terminated.\n\nerror: command 'gcc' failed with exit status 1\n\n----------------------------------------\nCommand /home/zjm1126/zjm_test/mysite/bin/python -c \"import setuptools;__file__='/home/zjm1126/zjm_test/mysite/build/mysql-python/setup.py';execfile(__file__)\" install --single-version-externally-managed --record /tmp/pip-XuVIux-record/install-record.txt --install-headers /home/zjm1126/zjm_test/mysite/include/site/python2.6 failed with error code 1\nException information:\nTraceback (most recent call last):\n File \"/home/zjm1126/zjm_test/mysite/lib/python2.6/site-packages/pip-0.8.1-py2.6.egg/pip/basecommand.py\", line 130, in main\n self.run(options, args)\n File \"/home/zjm1126/zjm_test/mysite/lib/python2.6/site-packages/pip-0.8.1-py2.6.egg/pip/commands/install.py\", line 228, in run\n requirement_set.install(install_options, global_options)\n File \"/home/zjm1126/zjm_test/mysite/lib/python2.6/site-packages/pip-0.8.1-py2.6.egg/pip/req.py\", line 1043, in install\n requirement.install(install_options, global_options)\n File \"/home/zjm1126/zjm_test/mysite/lib/python2.6/site-packages/pip-0.8.1-py2.6.egg/pip/req.py\", line 559, in install\n cwd=self.source_dir, filter_stdout=self._filter_install, show_stdout=False)\n File \"/home/zjm1126/zjm_test/mysite/lib/python2.6/site-packages/pip-0.8.1-py2.6.egg/pip/__init__.py\", line 249, in call_subprocess\n % (command_desc, proc.returncode))\nInstallationError: Command /home/zjm1126/zjm_test/mysite/bin/python -c \"import setuptools;__file__='/home/zjm1126/zjm_test/mysite/build/mysql-python/setup.py';execfile(__file__)\" install --single-version-externally-managed --record /tmp/pip-XuVIux-record/install-record.txt --install-headers 
/home/zjm1126/zjm_test/mysite/include/site/python2.6 failed with error code 1\n\n", "output": "\n** make sure the development packages of libxml2 and libxslt are installed **\n\nAssuming you are running a Debian-based distribution\nsudo apt-get install python-dev libxml2-dev libxslt1-dev zlib1g-dev\n\nIf you're installing for Python 3 on Debian-based distributions, you'll want to install python3-dev too.\n"} {"input": "Importing modules from parent folder I am running Python 2.5.\nThis is my folder tree:\nptdraft/\n nib.py\n simulations/\n life/\n life.py\n\n(I also have __init__.py in each folder, omitted here for readability)\nHow do I import the nib module from inside the life module? I am hoping it is possible to do without tinkering with sys.path.\nNote: The main module being run is in the ptdraft folder.\n", "output": "You could use relative imports (python >= 2.5):\nfrom ... import nib\n\n(What's New in Python 2.5) PEP 328: Absolute and Relative Imports\nEDIT: added another dot '.' to go up two packages\n"} {"input": "pip install from github repo branch Seems to be rather straightforward but I just can't get it right. Trying to pip install a repo's specific branch. Googling tells me to \n\npip install https://github.com/user/repo.git@branch\n\nThe branch's name is issue/34/oscar-0.6 so I did pip install https://github.com/tangentlabs/django-oscar-paypal.git@/issue/34/oscar-0.6 but it's returning a 404.\nHow do I install this branch?\n", "output": "Prepend the url prefix git+ (See VCS Support):\npip install git+https://github.com/tangentlabs/django-oscar-paypal.git@issue/34/oscar-0.6\n\nAnd specify the branch name without the leading /.\n"} {"input": "Circular (or cyclic) imports in Python What will happen if two modules import each other?\nTo generalize the problem, what about the cyclic imports in Python?\n", "output": "There was a really good discussion on this over at comp.lang.python last year. 
It answers your question pretty thoroughly.\n\nImports are pretty straightforward really. Just remember the following:\n'import' and 'from xxx import yyy' are executable statements. They execute\n when the running program reaches that line.\nIf a module is not in sys.modules, then an import creates the new module\n entry in sys.modules and then executes the code in the module. It does not\n return control to the calling module until the execution has completed.\nIf a module does exist in sys.modules then an import simply returns that\n module whether or not it has completed executing. That is the reason why\n cyclic imports may return modules which appear to be partly empty.\nFinally, the executing script runs in a module named __main__, importing\n the script under its own name will create a new module unrelated to\n __main__.\nTake that lot together and you shouldn't get any surprises when importing\n modules. \n\n"} {"input": "How do I watch a file for changes using Python? I have a log file being written by another process which I want to watch for changes. Each time a change occurrs I'd like to read the new data in to do some processing on it.\nWhat's the best way to do this? I was hoping there'd be some sort of hook from the PyWin32 library. I've found the win32file.FindNextChangeNotification function but have no idea how to ask it to watch a specific file.\nIf anyone's done anything like this I'd be really grateful to hear how...\n[Edit] I should have mentioned that I was after a solution that doesn't require polling.\n[Edit] Curses! It seems this doesn't work over a mapped network drive. 
I'm guessing windows doesn't 'hear' any updates to the file the way it does on a local disk.\n", "output": "Did you try using Watchdog?\n\nPython API library and shell utilities to monitor file system events.\nDirectory monitoring made easy with\n\nA cross-platform API.\nA shell tool to run commands in response to directory changes.\n\nGet started quickly with a simple example in Quickstart...\n\n"} {"input": "How do I resize an image using PIL and maintain its aspect ratio? Is there an obvious way to do this that I'm missing? I'm just trying to make thumbnails.\n", "output": "Define a maximum size.\nThen, compute a resize ratio by taking min(maxwidth/width, maxheight/height).\nThe proper size is oldsize*ratio.\nThere is of course also a library method to do this: the method Image.thumbnail.\nBelow is an (edited) example from the PIL documentation.\nimport os, sys\nimport Image\n\nsize = 128, 128\n\nfor infile in sys.argv[1:]:\n outfile = os.path.splitext(infile)[0] + \".thumbnail\"\n if infile != outfile:\n try:\n im = Image.open(infile)\n im.thumbnail(size, Image.ANTIALIAS)\n im.save(outfile, \"JPEG\")\n except IOError:\n print \"cannot create thumbnail for '%s'\" % infile\n\n"} {"input": "What does the Python Ellipsis object do? While idly surfing the namespace I noticed an odd looking object called \"Ellipsis\", it does not seem to be or do anything special, but it's a globally available builtin. \nAfter a search I found that it is used in some obscure variant of the slicing syntax by Numpy and Scipy... but almost nothing else. \nWas this object added to the language specifically to support Numpy + Scipy? Does Ellipsis have any generic meaning or use at all?\nD:\\workspace\\numpy>python\nPython 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on win32\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n>>> Ellipsis\nEllipsis\n\n", "output": "This came up in another question recently. 
I'll elaborate on my answer from there:\nEllipsis is an object that can appear in slice notation. For example:\nmyList[1:2, ..., 0]\n\nIts interpretation is purely up to whatever implements the __getitem__ function and sees Ellipsis objects there, but its main (and intended) use is in the numeric python extension, which adds a multidimensional array type. Since there are more than one dimensions, slicing becomes more complex than just a start and stop index; it is useful to be able to slice in multiple dimensions as well. E.g., given a 4x4 array, the top left area would be defined by the slice [:2,:2]:\n>>> a\narray([[ 1, 2, 3, 4],\n [ 5, 6, 7, 8],\n [ 9, 10, 11, 12],\n [13, 14, 15, 16]])\n\n>>> a[:2,:2] # top left\narray([[1, 2],\n [5, 6]])\n\nExtending this further, Ellipsis is used here to indicate a placeholder for the rest of the array dimensions not specified. Think of it as indicating the full slice [:] for all the dimensions in the gap it is placed, so for a 3d array, a[...,0] is the same as a[:,:,0] and for 4d, a[:,:,:,0], similarly, a[0,...,0] is a[0,:,:,0] (with however many colons in the middle make up the full number of dimensions in the array).\nInterestingly, in python3, the Ellipsis literal (...) is usable outside the slice syntax, so you can actually write: \n>>> ...\nEllipsis\n\nOther than the various numeric types, no, I don't think it's used. As far as I'm aware, it was added purely for numpy use and has no core support other than providing the object and corresponding syntax. 
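To see what actually reaches __getitem__, here is a minimal sketch with a toy class (purely illustrative, not numpy):

```python
class Probe(object):
    def __getitem__(self, item):
        # Whatever is written between the brackets arrives here unchanged.
        return item

p = Probe()
print(p[...])          # Ellipsis
print(p[1:2, ..., 0])  # (slice(1, 2, None), Ellipsis, 0)
```

Multidimensional slices arrive as a tuple, and Ellipsis is just one more object inside it for the implementation to interpret.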
The object being there didn't require this, but the literal \"...\" support for slices did.\n"} {"input": "Python access to first element in dictionary If dict is not empty, the way I use to access first element in dict is:\ndict[dict.keys()[0]]\n\nIs there any better way to do this?\n", "output": "Non-destructively you can:\ndict.itervalues().next()\n\nOn Python 3 this becomes\nnext (iter (dict.values()))\n\nIf you want it to work in both Python 2 & 3, you can use the six library:\nsix.next(six.itervalues(dict))\n\nthough at this point it is quite cryptic and I'd rather prefer your code.\nIf you want to remove any item, do:\nkey, value = dict.popitem()\n\nNote that \"first\" is not an appropriate term here. This is \"any\" item, because dict is not an ordered type.\n"} {"input": "Python List vs. Array - when to use? If you are creating a 1d array, you can implement it as a List, or else use the 'array' module in the standard library. I have always used Lists for 1d arrays.\nWhat is the reason or circumstance where I would want to use the array module instead?\nIs it for performance and memory optimization, or am I missing something obvious?\n", "output": "Basically, Python lists are very flexible and can hold completely heterogeneous, arbitrary data, and they can be appended to very efficiently, in amortized constant time. If you need to shrink and grow your array time-efficiently and without hassle, they are the way to go. But they use a lot more space than C arrays.\nThe array.array type, on the other hand, is just a thin wrapper on C arrays. It can hold only homogeneous data, all of the same type, and so it uses only sizeof(one object) * length bytes of memory. Mostly, you should use it when you need to expose a C array to an extension or a system call (for example, ioctl or fcntl). 
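A quick sketch of that contrast (the typecode and values are illustrative):

```python
from array import array

mixed = [1, "two", 3.0]        # list: arbitrary, freely mixed objects
ints = array("i", [1, 2, 3])   # array.array: homogeneous C ints only

print(ints.itemsize)           # bytes per element (platform-dependent)
try:
    ints.append("four")        # a non-integer is rejected outright
except TypeError:
    print("array.array enforces its element type")
```
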
It's also a good way to represent a mutable string (array('B', bytes)) until that actually becomes available in Python 3.0.\nHowever, if you want to do math on a homogeneous array of numeric data, then you're much better off using NumPy, which can automatically vectorize operations on complex multi-dimensional arrays.\nTo make a long story short: array.array is useful when you need a homogeneous C array of data for reasons other than doing math.\n"} {"input": "How to find out if a Python object is a string? How can I check if a Python object is a string (either regular or Unicode)?\n", "output": "Python 2\nUse isinstance(obj, basestring) for an object-to-test obj.\nDocs.\n"} {"input": "How do I disable log messages from the Requests library? By default, the Requests library writes log messages to the console, along the lines of Starting new HTTP connection (1): example.com etc. I'm usually not interested in these messages, how do I disable them?\n", "output": "I found out how to configure requests's logging level, it's done via the standard logging module. I decided to configure it to not log messages unless they are at least warnings:\nimport logging\n\nlogging.getLogger(\"requests\").setLevel(logging.WARNING)\n\nIf you wish to apply this setting for the urllib3 library (typically used by requests) too, add the following:\nlogging.getLogger(\"urllib3\").setLevel(logging.WARNING)\n\n"} {"input": "Why is \"except: pass\" a bad programming practice? I often see comments on other Stack Overflow questions about how the use of except: pass is discouraged. Why is this bad? Sometimes I just don't care what the errors are, and I want to just continue with the code.\ntry:\n something\nexcept:\n pass\n\nWhy is using an except: pass block bad? What makes it bad? 
Is it the fact that I pass on an error or that I except any error?\n", "output": "The main problem here is that it ignores all and any error: Out of memory, CPU is burning, user wants to stop, program wants to exit, Jabberwocky is killing users.\nThis is way too much. In your head, you're thinking \"I want to ignore this network error\". If something unexpected goes wrong, then your code silently continues and breaks in completely unpredictable ways that no one can debug.\nThat's why you should limit yourself to ignoring specifically only some errors and let the rest pass.\n"} {"input": "Why doesn't Python have multiline comments? OK, I'm aware that triple-quotes strings can serve as multiline comments. For example,\n\"\"\"Hello, I am a \n multiline comment\"\"\"\n\nand \n'''Hello, I am a \n multiline comment'''\n\nBut technically speaking these are strings, correct?\nI've googled and read the Python style guide, but I was unable to find a technical answer to why there is no formal implementation of multiline, /* */ type of comments. I have no problem using triple quotes, but I am a little curious as to what led to this design decision.\n", "output": "I doubt you'll get a better answer than, \"Guido didn't feel the need for multi-line comments\". \nGuido has tweeted about this,\n\nPython tip: You can use multi-line strings as multi-line comments. Unless used as docstrings, they generate no code! :-)\n\n"} {"input": "Install a Python package into a different directory using pip? I know the obvious answer is to use virtualenv and virtualenvwrapper, but for various reasons I can't/don't want to do that.\nSo how do I modify the command\npip install package_name\n\nto make pip install the package somewhere other than the default site-packages?\n", "output": "Use:\npip install --install-option=\"--prefix=$PREFIX_PATH\" package_name\n\nYou might also want to use --ignore-installed to force all dependencies to be reinstalled using this new prefix. 
You can use --install-option multiple times to add any of the options you can use with python setup.py install (--prefix is probably what you want, but there are a bunch more options you could use).\n"} {"input": "What are Class methods in Python for? I'm teaching myself Python and my most recent lesson was that Python is not Java, and so I've just spent a while turning all my Class methods into functions.\nI now realise that I don't need to use Class methods for what I would have done with static methods in Java, but now I'm not sure when I would use them. All the advice I can find about Python Class methods is along the lines of newbies like me should steer clear of them, and the standard documentation is at its most opaque when discussing them.\nDoes anyone have a good example of using a Class method in Python or at least can someone tell me when Class methods can be sensibly used?\n", "output": "Class methods are for when you need to have methods that aren't specific to any particular instance, but still involve the class in some way. The most interesting thing about them is that they can be overridden by subclasses, something that's simply not possible in Java's static methods or Python's module-level functions.\nIf you have a class MyClass, and a module-level function that operates on MyClass (factory, dependency injection stub, etc), make it a classmethod. Then it'll be available to subclasses.\n"} {"input": "How can I check for Python version in a program that uses new language features? If I have a Python script that requires at least a particular\nversion of Python, what is the correct way to fail gracefully\nwhen an earlier version of Python is used to launch the script?\nHow do I get control early enough to issue an error message\nand exit?\nFor example, I have a program that uses the ternary operator (new in 2.5) and \"with\" blocks\n(new in 2.6). 
I wrote a simple little interpreter-version\nchecker routine which is the first thing the script would\ncall ... except it doesn't get that far. Instead, the\nscript fails during python compilation, before my routines\nare even called. Thus the user of the script sees some very\nobscure syntax error tracebacks - which pretty much require\nan expert to deduce that it is simply the case of running\nthe wrong version of Python.\nI know how to check the version of Python. The issue is that some syntax is illegal in older versions of Python. Consider this program:\nimport sys\nif sys.version_info < (2, 4):\n raise \"must use python 2.5 or greater\"\nelse:\n # syntax error in 2.4, ok in 2.5\n x = 1 if True else 2\n print x\n\nWhen run under 2.4, I want this result\n$ ~/bin/python2.4 tern.py \nmust use python 2.5 or greater\n\nand not this result:\n$ ~/bin/python2.4 tern.py \n File \"tern.py\", line 5\n x = 1 if True else 2\n ^\nSyntaxError: invalid syntax\n\n(Channeling for a coworker.)\n", "output": "You can test using eval:\ntry:\n eval(\"1 if True else 2\")\nexcept SyntaxError:\n # doesn't have ternary\n\nAlso, with is available in Python 2.5, just add from __future__ import with_statement.\nEDIT: to get control early enough, you could split it into different .py files and check compatibility in the main file before importing (e.g. in __init__.py in a package):\n# __init__.py\n\n# Check compatibility\ntry:\n eval(\"1 if True else 2\")\nexcept SyntaxError:\n raise ImportError(\"requires ternary support\")\n\n# import from another module\nfrom impl import *\n\n"} {"input": "How do I calculate the number of days between two dates using Python? If I have two dates (ex. 
'8/18/2008' and '9/26/2008') what is the best way to get the difference measured in days?\n", "output": "If you have two date objects, you can just subtract them.\nfrom datetime import date\n\nd0 = date(2008, 8, 18)\nd1 = date(2008, 9, 26)\ndelta = d1 - d0\nprint delta.days\n\nThe relevant section of the docs:\nhttps://docs.python.org/library/datetime.html\n"} {"input": "How can I install packages in my $HOME folder with pip? Is it possible? When installing with pip, install the python packages inside my $HOME folder. (for example, I want to install mercurial, using pip, but inside $HOME instead of /usr/local)\nI'm on a mac machine and just thought about this possibility, instead of \"polluting\" my /usr/local, I would use my $HOME instead.\nPEP370 is exactly about this. Is just creating ~/.local and doing a pip install package enough to make these packages be installed only in my $HOME folder?\n", "output": "While you can use a virtualenv, you don't need to. The trick is passing the PEP370 --user argument to the setup.py script. With the latest version of pip, one way to do it is:\npip install --user mercurial\n\nThis should result in the hg script being installed in $HOME/.local/bin/hg and the rest of the hg package in $HOME/.local/lib/pythonx.y/site-packages/.\nNote that the above is true for Python 2.6. There has been a bit of controversy among the Python core developers about what is the appropriate directory location on Mac OS X for PEP370-style user installations. In Python 2.7 and 3.2, the location on Mac OS X was changed from $HOME/.local to $HOME/Library/Python. This might change in a future release. But, for now, on 2.7 (and 3.2, if hg were supported on Python 3), the above locations will be $HOME/Library/Python/x.y/bin/hg and $HOME/Library/Python/x.y/lib/python/site-packages.\n"} {"input": "Is there any difference between \"foo is None\" and \"foo == None\"? 
Is there any difference between:\nif foo is None: pass\n\nand\nif foo == None: pass\n\nThe convention that I've seen in most Python code (and the code I myself write) is the former, but I recently came across code which uses the latter. None is an instance (and the only instance, IIRC) of NoneType, so it shouldn't matter, right? Are there any circumstances in which it might?\n", "output": "is always returns True if it compares the same object instance.\nWhereas == is ultimately determined by the __eq__() method.\ni.e.\n\n>>> class foo(object):\n def __eq__(self, other):\n return True\n\n>>> f = foo()\n>>> f == None\nTrue\n>>> f is None\nFalse\n\n"} {"input": "Understanding __get__ and __set__ and Python descriptors I am trying to understand what Python's descriptors are and what they can be useful for. However, I am failing at it. I understand how they work, but here are my doubts. Consider the following code:\nclass Celsius(object):\n def __init__(self, value=0.0):\n self.value = float(value)\n def __get__(self, instance, owner):\n return self.value\n def __set__(self, instance, value):\n self.value = float(value)\n\n\nclass Temperature(object):\n celsius = Celsius()\n\n\nWhy do I need the descriptor class? Please explain using this example or the one you think is better. \nWhat is instance and owner here? (in __get__). So my question is, what is the purpose of the third parameter here?\nHow would I call/use this example?\n\n", "output": "The descriptor is how Python's property type is implemented. A descriptor simply implements __get__, __set__, etc. and is then added to another class in its definition (as you did above with the Temperature class). 
For example:\ntemp=Temperature()\ntemp.celsius #calls celsius.__get__\n\nAccessing the property you assigned the descriptor to (celsius in the above example) calls the appropriate descriptor method.\ninstance in __get__ is the instance of the class (so above, __get__ would receive temp), while owner is the class with the descriptor (so it would be Temperature).\nYou need to use a descriptor class to encapsulate the logic that powers it. That way, if the descriptor is used to cache some expensive operation (for example), it could store the value on itself and not its class.\nAn article about descriptors can be found at http://martyalchin.com/2007/nov/23/python-descriptors-part-1-of-2/\nEDIT: As jchl pointed out in the comments, if you simply try Temperature.celsius, instance will be None.\n"} {"input": "Reloading submodules in IPython Currently I am working on a python project that contains sub modules and uses numpy/scipy. IPython is used as the interactive console. Unfortunately I am not very happy with the workflow that I am using right now, I would appreciate some advice.\nIn IPython, the framework is loaded by a simple import command. However, it is often necessary to change code in one of the submodules of the framework. At this point a model is already loaded and I use IPython to interact with it. \nNow, the framework contains many modules that depend on each other, i.e. when the framework is initially loaded the main module is importing and configuring the submodules. The changes to the code are only executed if the module is reloaded using reload(main_mod.sub_mod). This is cumbersome as I need to reload all changed modules individually using the full path. It would be very convenient if reload(main_module) would also reload all sub modules, but without reloading numpy/scipy.\n", "output": "IPython comes with some automatic reloading magic:\n%load_ext autoreload\n%autoreload 2\n\nIt will reload all changed modules every time before executing a new line. 
The way this works is slightly different from dreload. Some caveats apply, type %autoreload? to see what can go wrong.\n\nIf you want to always enable this setting, modify your IPython configuration file ~/.ipython/profile_default/ipython_config.py[1] by appending:\nc.InteractiveShellApp.extensions = ['autoreload'] \nc.InteractiveShellApp.exec_lines = ['%autoreload 2']\n\nCredit to @Kos via a comment below.\n[1]\nIf you don't have the file ~/.ipython/profile_default/ipython_config.py, you need to call ipython profile create first. Or the file may be located at $IPYTHONDIR.\n"} {"input": "Class method differences in Python: bound, unbound and static What is the difference between the following class methods?\nIs it that one is static and the other is not?\nclass Test(object):\n def method_one(self):\n print \"Called method_one\"\n\n def method_two():\n print \"Called method_two\"\n\na_test = Test()\na_test.method_one()\na_test.method_two()\n\n", "output": "In Python, there is a distinction between bound and unbound methods. \nBasically, a call to a member function (like method_one), a bound function\na_test.method_one()\n\nis translated to\nTest.method_one(a_test)\n\ni.e. a call to an unbound method. Because of that, a call to your version of method_two will fail with a TypeError\n>>> a_test = Test() \n>>> a_test.method_two()\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nTypeError: method_two() takes no arguments (1 given)\n\nYou can change the behavior of a method using a decorator\nclass Test(object):\n def method_one(self):\n print \"Called method_one\"\n\n @staticmethod\n def method_two():\n print \"Called method_two\"\n\nThe decorator tells the built-in default metaclass type (the class of a class, cf. 
this question) to not create bound methods for method_two.\nNow, you can invoke static method both on an instance or on the class directly:\n>>> a_test = Test()\n>>> a_test.method_one()\nCalled method_one\n>>> a_test.method_two()\nCalled method_two\n>>> Test.method_two()\nCalled method_two\n\n"} {"input": "How do I test one variable against multiple values? I'm trying to make a function that will compare multiple variables to an integer and output a string of three letters. I was wondering if there was a way to translate this into Python. So say:\nx = 0\ny = 1\nz = 3\nMylist = []\n\nif x or y or z == 0 :\n Mylist.append(\"c\")\nelif x or y or z == 1 :\n Mylist.append(\"d\")\nelif x or y or z == 2 :\n Mylist.append(\"e\")\nelif x or y or z == 3 : \n Mylist.append(\"f\")\n\nwhich would return a list of \n[\"c\", \"d\", \"f\"]\n\nIs something like this possible?\n", "output": "You misunderstand how boolean expressions work; they don't work like an English sentence and guess that you are talking about the same comparison for all names here. You are looking for:\nif x == 1 or y == 1 or z == 1:\n\nx and y are otherwise evaluated on their own (False if 0, True otherwise).\nYou can shorten that to:\nif 1 in (x, y, z):\n\nor better still:\nif 1 in {x, y, z}:\n\nusing a set to take advantage of the constant-cost membership test (in takes a fixed amount of time whatever the left-hand operand is).\nWhen you use or, python sees each side of the operator as separate expressions. The expression x or y == 1 is treated as first a boolean test for x, then if that is False, the expression y == 1 is tested.\nThis is due to operator precedence. The or operator has a lower precedence than the == test, so the latter is evaluated first.\nHowever, even if this were not the case, and the expression x or y or z == 1 was actually interpreted as (x or y or z) == 1 instead, this would still not do what you expect it to do. 
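A quick interactive-style sketch makes the difference concrete (the values chosen for x, y and z here are arbitrary examples):

```python
x, y, z = 2, 1, 0

# Chained `or` returns the first truthy operand; it is not a combined comparison.
first_truthy = (x or y or z)
print(first_truthy)      # 2, so (x or y or z) == 1 would be False here

# The membership test expresses what the original code actually meant:
print(1 in {x, y, z})    # True
```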
\nx or y or z would evaluate to the first argument that is 'truthy', e.g. not False, numeric 0 or empty (see boolean expressions for details on what Python considers false in a boolean context).\nSo for the values x = 2; y = 1; z = 0, x or y or z would resolve to 2, because that is the first true-like value in the arguments. Then 2 == 1 would be False, even though y == 1 would be True.\n"} {"input": "Asking the user for input until they give a valid response I am writing a program that must accept input from the user.\n#note: Python 2.7 users should use `raw_input`, the equivalent of 3.X's `input`\nage = int(input(\"Please enter your age: \"))\nif age >= 18: \n print(\"You are able to vote in the United States!\")\nelse:\n print(\"You are not able to vote in the United States.\")\n\nThis works as expected if the user enters sensible data. \nC:\\Python\\Projects> canyouvote.py\nPlease enter your age: 23\nYou are able to vote in the United States!\n\nBut if they make a mistake, then it crashes:\nC:\\Python\\Projects> canyouvote.py\nPlease enter your age: dickety six\nTraceback (most recent call last):\n File \"canyouvote.py\", line 1, in <module>\n age = int(input(\"Please enter your age: \"))\nValueError: invalid literal for int() with base 10: 'dickety six'\n\nInstead of crashing, I would like it to try getting the input again. Like this:\nC:\\Python\\Projects> canyouvote.py\nPlease enter your age: dickety six\nSorry, I didn't understand that.\nPlease enter your age: 26\nYou are able to vote in the United States!\n\nHow can I accomplish this? What if I also wanted to reject values like -1, which is a valid int, but nonsensical in this context?\n", "output": "The simplest way to accomplish this would be to put the input method in a while loop. 
Use continue when you get bad input, and break out of the loop when you're satisfied.\nWhen Your Input Might Raise an Exception\nUse try and except to detect when the user enters data that can't be parsed.\nwhile True:\n try:\n # Note: Python 2.x users should use raw_input, the equivalent of 3.x's input\n age = int(input(\"Please enter your age: \"))\n except ValueError:\n print(\"Sorry, I didn't understand that.\")\n #better try again... Return to the start of the loop\n continue\n else:\n #age was successfully parsed!\n #we're ready to exit the loop.\n break\nif age >= 18: \n print(\"You are able to vote in the United States!\")\nelse:\n print(\"You are not able to vote in the United States.\")\n\nImplementing Your Own Validation Rules\nIf you want to reject values that Python can successfully parse, you can add your own validation logic.\nwhile True:\n data = input(\"Please enter a loud message (must be all caps): \")\n if not data.isupper():\n print(\"Sorry, your response was not loud enough.\")\n continue\n else:\n #we're happy with the value given.\n #we're ready to exit the loop.\n break\n\nwhile True:\n data = input(\"Pick an answer from A to D:\")\n if data.lower() not in ('a', 'b', 'c', 'd'):\n print(\"Not an appropriate choice.\")\n else:\n break\n\nCombining Exception Handling and Custom Validation\nBoth of the above techniques can be combined into one loop.\nwhile True:\n try:\n age = int(input(\"Please enter your age: \"))\n except ValueError:\n print(\"Sorry, I didn't understand that.\")\n continue\n\n if age < 0:\n print(\"Sorry, your response must not be negative.\")\n continue\n else:\n #age was successfully parsed, and we're happy with its value.\n #we're ready to exit the loop.\n break\nif age >= 18: \n print(\"You are able to vote in the United States!\")\nelse:\n print(\"You are not able to vote in the United States.\")\n\nEncapsulating it All in a Function\nIf you need to ask your user for a lot of different values, it might be useful to put 
this code in a function, so you don't have to retype it every time.\ndef get_non_negative_int(prompt):\n while True:\n try:\n value = int(input(prompt))\n except ValueError:\n print(\"Sorry, I didn't understand that.\")\n continue\n\n if value < 0:\n print(\"Sorry, your response must not be negative.\")\n continue\n else:\n break\n return value\n\nage = get_non_negative_int(\"Please enter your age: \")\nkids = get_non_negative_int(\"Please enter the number of children you have: \")\nsalary = get_non_negative_int(\"Please enter your yearly earnings, in dollars: \")\n\nPutting it all together\nYou can extend this idea to make a very generic input function:\ndef sanitised_input(prompt, type_=None, min_=None, max_=None, range_=None): \n if min_ is not None and max_ is not None and max_ < min_: \n raise ValueError(\"min_ must be less than or equal to max_.\") \n while True: \n ui = input(prompt) \n if type_ is not None: \n try: \n ui = type_(ui) \n except ValueError: \n print(\"Input type must be {0}.\".format(type_.__name__)) \n continue\n if max_ is not None and ui > max_: \n print(\"Input must be less than or equal to {0}.\".format(max_)) \n elif min_ is not None and ui < min_: \n print(\"Input must be greater than or equal to {0}.\".format(min_)) \n elif range_ is not None and ui not in range_: \n if isinstance(range_, range): \n template = \"Input must be between {0.start} and {0.stop}.\"\n print(template.format(range_)) \n else: \n template = \"Input must be {0}.\"\n if len(range_) == 1: \n print(template.format(*range_)) \n else: \n print(template.format(\" or \".join((\", \".join(map(str, \n range_[:-1])), \n str(range_[-1]))))) \n else: \n return ui \n\nWith usage such as:\nage = sanitised_input(\"Enter your age: \", int, 1, 101)\nanswer = sanitised_input(\"Enter your answer\", str.lower, range_=('a', 'b', 'c', 'd'))\n\nCommon Pitfalls, and Why you Should Avoid Them\nThe Redundant Use of Redundant input Statements\nThis method works but is generally considered 
poor style:\ndata = input(\"Please enter a loud message (must be all caps): \")\nwhile not data.isupper():\n print(\"Sorry, your response was not loud enough.\")\n data = input(\"Please enter a loud message (must be all caps): \")\n\nIt might look attractive initially because it's shorter than the while True method, but it violates the Don't Repeat Yourself principle of software development. This increases the likelihood of bugs in your system. What if you want to backport to 2.7 by changing input to raw_input, but accidentally change only the first input above? It's a SyntaxError just waiting to happen.\nRecursion Will Blow Your Stack\nIf you've just learned about recursion, you might be tempted to use it in get_non_negative_int so you can dispose of the while loop.\ndef get_non_negative_int(prompt):\n try:\n value = int(input(prompt))\n except ValueError:\n print(\"Sorry, I didn't understand that.\")\n return get_non_negative_int(prompt)\n\n if value < 0:\n print(\"Sorry, your response must not be negative.\")\n return get_non_negative_int(prompt)\n else:\n return value\n\nThis appears to work fine most of the time, but if the user enters invalid data enough times, the script will terminate with a RuntimeError: maximum recursion depth exceeded. You may think \"no fool would make 1000 mistakes in a row\", but you're underestimating the ingenuity of fools!\n"} {"input": "Is there a Numpy function to return the first index of something in an array? 
I know there is a method for Python lists to return the first index of something\nl = list(1,2,3)\nl.index(2)\n>>> 1\n\nIs there something like that for numpy arrays?\n", "output": "Yes. Here is the answer, given a Numpy array, array, and a value, item, to search for:\nitemindex = numpy.where(array==item)\n\nThe result is a tuple with first all the row indices, then all the column indices.\nFor example, if array is two-dimensional and it contained your item at two locations, then\narray[itemindex[0][0]][itemindex[1][0]]\n\nwould be equal to your item and so would\narray[itemindex[0][1]][itemindex[1][1]]\n\nnumpy.where\n"} {"input": "How to copy a dictionary and only edit the copy Can someone please explain this to me? This doesn't make any sense to me.\nI copy a dictionary into another and edit the second and both are changed. Why is this happening?\n>>> dict1 = {\"key1\": \"value1\", \"key2\": \"value2\"}\n>>> dict2 = dict1\n>>> dict2\n{'key2': 'value2', 'key1': 'value1'}\n>>> dict2[\"key2\"] = \"WHY?!\"\n>>> dict1\n{'key2': 'WHY?!', 'key1': 'value1'}\n\n", "output": "Python never implicitly copies objects. When you set dict2 = dict1, you are making them refer to the same exact dict object, so when you mutate it, all references to it keep referring to the object in its current state.\nIf you want to copy the dict (which is rare), you have to do so explicitly with\ndict2 = dict(dict1)\n\nor\ndict2 = dict1.copy()\n\n"} {"input": "Empty set literal in Python? [] = empty list\n() = empty tuple\n{} = empty dict\nIs there a similar notation for an empty set?\nOr do I have to write set()?\n", "output": "No, there's no literal syntax for the empty set. You have to write set().\n"} {"input": "Django - what is the difference between render(), render_to_response() and direct_to_template()? What's the difference (in language a python/django noob can understand) in a view between render(), render_to_response() and direct_to_template()?\ne.g. 
from Nathan Borror's basic apps examples\ndef comment_edit(request, object_id, template_name='comments/edit.html'):\n comment = get_object_or_404(Comment, pk=object_id, user=request.user)\n # ...\n return render(request, template_name, {\n 'form': form,\n 'comment': comment,\n })\n\nBut I've also seen\n return render_to_response(template_name, my_data_dictionary,\n context_instance=RequestContext(request))\n\nAnd\n return direct_to_template(request, template_name, my_data_dictionary)\n\nWhat's the difference, and what should I use in any particular situation?\n", "output": "https://docs.djangoproject.com/en/1.8/topics/http/shortcuts/#render\nrender(request, template[, dictionary][, context_instance][, content_type][, status][, current_app])\n\nrender() is a brand spanking new shortcut for render_to_response in 1.3 that will automatically use RequestContext that I will most definitely be using from now on.\n\nhttps://docs.djangoproject.com/en/1.8/topics/http/shortcuts/#render-to-response\nrender_to_response(template[, dictionary][, context_instance][, mimetype])\n\nrender_to_response is your standard render function used in the tutorials and such. To use RequestContext you'd have to specify context_instance=RequestContext(request).\n\nhttps://docs.djangoproject.com/en/1.8/ref/generic-views/#django-views-generic-simple-direct-to-template\ndirect_to_template is a generic view that I use in my views (as opposed to in my urls) because like the new render() function, it automatically uses RequestContext and all its context_processors.\nBut direct_to_template should be avoided as function-based generic views are deprecated. Either use render or an actual class, see https://docs.djangoproject.com/en/1.3/topics/generic-views-migration/\nI'm happy I haven't typed RequestContext in a long, long time. 
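To make the equivalence above concrete, render() in 1.3 behaves roughly like the following wrapper around render_to_response. This is a conceptual sketch in Python-flavored pseudocode, not Django's actual source, and the render_sketch name is made up for illustration:

```python
from django.shortcuts import render_to_response
from django.template import RequestContext

def render_sketch(request, template_name, dictionary=None, **kwargs):
    # The shortcut's main convenience: it builds the RequestContext for you,
    # so your context processors run without you typing context_instance=...
    return render_to_response(template_name, dictionary or {},
                              context_instance=RequestContext(request),
                              **kwargs)
```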
\n"} {"input": "Unicode (utf8) reading and writing to files in python I'm having some brain failure in understanding reading and writing text to a file (Python 2.4).\n# the string, which has an a-acute in it.\nss = u'Capit\\xe1n'\nss8 = ss.encode('utf8')\nrepr(ss), repr(ss8)\n\n\n(\"u'Capit\\xe1n'\", \"'Capit\\xc3\\xa1n'\")\n\nprint ss, ss8 \nprint >> open('f1','w'), ss8\n\n>>> file('f1').read() \n'Capit\\xc3\\xa1n\\n'\n\nSo I type in Capit\\xc3\\xa1n into my favorite editor, in file f2.\nthen:\n>>> open('f1').read()\n'Capit\\xc3\\xa1n\\n'\n>>> open('f2').read()\n'Capit\\\\xc3\\\\xa1n\\n'\n>>> open('f1').read().decode('utf8')\nu'Capit\\xe1n\\n'\n>>> open('f2').read().decode('utf8')\nu'Capit\\\\xc3\\\\xa1n\\n'\n\nWhat am I not understanding here? Clearly there is some vital bit of magic (or good sense) that I'm missing. What does one type into text files to get proper conversions.\nEdit: What I'm truly failing to grok here, is what the point of the UTF-8 representation is, if you can't actually get Python to recognize it, when it comes from outside. Maybe I should just JSON dump the string, and use that instead, since that has an asciiable representation! More to the point, is there an ascii representation of this unicode object that Python will recognize and decode, when coming in from a file? 
If so, how do I get it?\n>>> print simplejson.dumps(ss)\n'\"Capit\\u00e1n\"'\n>>> print >> file('f3','w'), simplejson.dumps(ss)\n>>> simplejson.load(open('f3'))\nu'Capit\\xe1n'\n\n", "output": "Rather than mess with the encode and decode methods I find it easier to use the open method from the codecs module.\n>>>import codecs\n>>>f = codecs.open(\"test\", \"r\", \"utf-8\")\n\nThen after calling f's read() function, a decoded unicode object is returned.\n>>>f.read()\nu'Capit\\xe1n\\n\\n'\n\nIf you know the encoding of a file, using the codecs package is going to be much less confusing.\nSee http://docs.python.org/library/codecs.html#codecs.open\n"} {"input": "Create List of Single Item Repeated n Times in Python I know a list comprehension will do this, but I was wondering if there is an even shorter (and more Pythonic?) approach.\nI want to create a series of lists, all of varying length. Each list will contain the same element e, repeated n times (where n = length of the list). How do I create the lists, without doing\n[e for number in xrange(n)]\n\nfor each list?\n", "output": "You can also write:\n[e] * n\n\nYou should note that if e is for example an empty list you get a list with n references to the same list, not n independent empty lists. \nPerformance testing\nAt first glance it seems that repeat is the fastest way to create a list with n identical elements:\n>>> timeit.timeit('itertools.repeat(0, 10)', 'import itertools', number = 1000000)\n0.37095273281943264\n>>> timeit.timeit('[0] * 10', 'import itertools', number = 1000000)\n0.5577236771712819\n\nBut wait - it's not a fair test...\n>>> itertools.repeat(0, 10)\nrepeat(0, 10) # Not a list!!!\n\nThe function itertools.repeat doesn't actually create the list, it just creates an object that can be used to create a list if you wish! 
Let's try that again, but converting to a list:\n>>> timeit.timeit('list(itertools.repeat(0, 10))', 'import itertools', number = 1000000)\n1.7508119747063233\n\nSo if you want a list, use [e] * n. If you want to generate the elements lazily, use repeat.\n"} {"input": "How can I represent an infinite number in Python? In Python, when you want to give a set of elements an associated value and use this value for comparisons between them, I want this value to be infinite. No matter which number you enter in the program, no number will be greater than this representation of infinity.\n", "output": "In Python, you can do:\ntest = float(\"inf\")\n\nIn Python 3.5, you can do:\nimport math\ntest = math.inf\n\nAnd then:\ntest > 1\ntest > 10000\ntest > x\n\nWill always be true. Unless of course, as pointed out, x is also infinity or \"nan\" (\"not a number\").\nAdditionally (Python 2.x ONLY), in a comparison to Ellipsis, float(inf) is lesser, e.g.:\nfloat('inf') < Ellipsis\n\nwould return true.\n"} {"input": "What is the difference between Python's re.search and re.match? What is the difference between the search() and match() functions in the Python re module?\nI've read the documentation (current documentation), but I never seem to remember it. I keep having to look it up and re-learn it. I'm hoping that someone will answer it clearly with examples so that (perhaps) it will stick in my head. Or at least I'll have a better place to return with my question and it will take less time to re-learn it.\n", "output": "re.match is anchored at the beginning of the string. 
That has nothing to do with newlines, so it is not the same as using ^ in the pattern.\nAs re.match documentation says:\n\nIf zero or more characters at the\n beginning of string match the regular expression pattern, return a\n corresponding MatchObject instance.\n Return None if the string does not\n match the pattern; note that this is\n different from a zero-length match.\nNote: If you want to locate a match\n anywhere in string, use search()\n instead.\n\nre.search searches the entire string, as the documentation says:\n\nScan through string looking for a\n location where the regular expression\n pattern produces a match, and return a\n corresponding MatchObject instance.\n Return None if no position in the\n string matches the pattern; note that\n this is different from finding a\n zero-length match at some point in the\n string.\n\nSo if you need to match at the beginning of the string, or to match the entire string, use match. It is faster. Otherwise use search.\nThe documentation has a specific section for match vs. search that also covers multiline strings:\n\nPython offers two different primitive\n operations based on regular\n expressions: match checks for a match\n only at the beginning of the string,\n while search checks for a match\n anywhere in the string (this is what\n Perl does by default).\nNote that match may differ from search\n even when using a regular expression\n beginning with '^': '^' matches only\n at the start of the string, or in\n MULTILINE mode also immediately\n following a newline. The \"match\"\n operation succeeds only if the pattern\n matches at the start of the string\n regardless of mode, or at the starting\n position given by the optional pos\n argument regardless of whether a\n newline precedes it.\n\nNow, enough talk. 
Time for some example code:\n# example code:\nstring_with_newlines = \"\"\"something\nsomeotherthing\"\"\"\n\nimport re\n\nprint re.match('some', string_with_newlines) # matches\nprint re.match('someother', \n string_with_newlines) # won't match\nprint re.match('^someother', string_with_newlines, \n re.MULTILINE) # also won't match\nprint re.search('someother', \n string_with_newlines) # finds something\nprint re.search('^someother', string_with_newlines, \n re.MULTILINE) # also finds something\n\nm = re.compile('thing$', re.MULTILINE)\n\nprint m.match(string_with_newlines) # no match\nprint m.match(string_with_newlines, pos=4) # matches\nprint m.search(string_with_newlines) # also matches\n\n"} {"input": "Get key by value in dictionary I made a function which will look up ages in a dictionary and show the matching name:\nlist = {'george':16,'amber':19}\nsearch_age = raw_input(\"Provide age\")\nfor age in list.values():\n if age == search_age:\n name = list[age]\n print name\n\nI know how to compare and find the age, I just don't know how to show the name of the person. Additionally, I am getting a KeyError because of line 5. I know it's not correct but I can't figure out how to make it search backwards.\n", "output": "mydict = {'george':16,'amber':19}\nprint mydict.keys()[mydict.values().index(16)] # Prints george\n\nOr in Python 3.x:\nmydict = {'george':16,'amber':19}\nprint(list(mydict.keys())[list(mydict.values()).index(16)]) # Prints george\n\nBasically, it separates the dictionary's values into a list, finds the position of the value you have, and gets the key at that position.\nMore about keys() and .values() in Python 3: Python: simplest way to get list of values from dict?\n"} {"input": "Representing and solving a maze given an image What is the best way to represent and solve a maze given an image?\n\nGiven a JPEG image (as seen above), what's the best way to read it in, parse it into some data structure and solve the maze? 
My first instinct is to read the image in pixel by pixel and store it in a list (array) of boolean values: True for a white pixel, and False for a non-white pixel (the colours can be discarded). The issue with this method is that the image may not be \"pixel perfect\". By that I simply mean that if there is a white pixel somewhere on a wall it may create an unintended path.\nAnother method (which came to me after a bit of thought) is to convert the image to an SVG file - which is a list of paths drawn on a canvas. This way, the paths could be read into the same sort of list (boolean values) where True indicates a path or wall, False indicating a travel-able space. An issue with this method arises if the conversion is not 100% accurate, and does not fully connect all of the walls, creating gaps.\nAnother issue with converting to SVG is that the lines are not \"perfectly\" straight. This results in the paths being cubic bezier curves. With a list (array) of boolean values indexed by integers, the curves would not transfer easily, and all the points that lie on the curve would have to be calculated, but won't exactly match list indices.\nI assume that while one of these methods may work (though probably not), they are woefully inefficient given such a large image, and that there exists a better way. How is this best (most efficiently and/or with the least complexity) done? Is there even a best way?\nThen comes the solving of the maze. If I use either of the first two methods, I will essentially end up with a matrix. According to this answer, a good way to represent a maze is using a tree, and a good way to solve it is using the A* algorithm. How would one create a tree from the image? Any ideas?\nTL;DR\nBest way to parse? Into what data structure? How would said structure help/hinder solving?\nUPDATE\nI've tried my hand at implementing what @Mikhail has written in Python, using numpy, as @Thomas recommended. 
I feel that the algorithm is correct, but it's not working as hoped. (Code below.) The PNG library is PyPNG.\nimport png, numpy, Queue, operator, itertools\n\ndef is_white(coord, image):\n \"\"\" Returns whether (x, y) is approx. a white pixel.\"\"\"\n a = True\n for i in xrange(3):\n if not a: break\n a = image[coord[1]][coord[0] * 3 + i] > 240\n return a\n\ndef bfs(s, e, i, visited):\n \"\"\" Perform a breadth-first search. \"\"\"\n frontier = Queue.Queue()\n while s != e:\n for d in [(-1, 0), (0, -1), (1, 0), (0, 1)]:\n np = tuple(map(operator.add, s, d))\n if is_white(np, i) and np not in visited:\n frontier.put(np)\n visited.append(s)\n s = frontier.get()\n return visited\n\ndef main():\n r = png.Reader(filename = \"thescope-134.png\")\n rows, cols, pixels, meta = r.asDirect()\n assert meta['planes'] == 3 # ensure the file is RGB\n image2d = numpy.vstack(itertools.imap(numpy.uint8, pixels))\n start, end = (402, 985), (398, 27)\n print bfs(start, end, image2d, [])\n\n", "output": "Here is a solution.\n\nConvert image to grayscale (not yet binary), adjusting weights for the colors so that final grayscale image is approximately uniform. You can do it simply by controlling sliders in Photoshop in Image -> Adjustments -> Black & White.\nConvert image to binary by setting appropriate threshold in Photoshop in Image -> Adjustments -> Threshold.\nMake sure threshold is selected right. Use the Magic Wand Tool with 0 tolerance, point sample, contiguous, no anti-aliasing. Check that edges at which selection breaks are not false edges introduced by wrong threshold. In fact, all interior points of this maze are accessible from the start.\nAdd artificial borders on the maze to make sure virtual traveler will not walk around it :)\nImplement breadth-first search (BFS) in your favorite language and run it from the start. I prefer MATLAB for this task. As @Thomas already mentioned, there is no need to mess with regular representation of graphs. 
You can work with binarized image directly.\n\nHere is the MATLAB code for BFS:\nfunction path = solve_maze(img_file)\n %% Init data\n img = imread(img_file);\n img = rgb2gray(img);\n maze = img > 0;\n start = [985 398];\n finish = [26 399];\n\n %% Init BFS\n n = numel(maze);\n Q = zeros(n, 2);\n M = zeros([size(maze) 2]);\n front = 0;\n back = 1;\n\n function push(p, d)\n q = p + d;\n if maze(q(1), q(2)) && M(q(1), q(2), 1) == 0\n front = front + 1;\n Q(front, :) = q;\n M(q(1), q(2), :) = reshape(p, [1 1 2]);\n end\n end\n\n push(start, [0 0]);\n\n d = [0 1; 0 -1; 1 0; -1 0];\n\n %% Run BFS\n while back <= front\n p = Q(back, :);\n back = back + 1;\n for i = 1:4\n push(p, d(i, :));\n end\n end\n\n %% Extracting path\n path = finish;\n while true\n q = path(end, :);\n p = reshape(M(q(1), q(2), :), 1, 2);\n path(end + 1, :) = p;\n if isequal(p, start) \n break;\n end\n end\nend\n\nIt is really very simple and standard, there should not be difficulties on implementing this in Python or whatever.\nAnd here is the answer:\n\n"} {"input": "What is the standard way to add N seconds to datetime.time in Python? 
Given a datetime.time value in Python, is there a standard way to add an integer number of seconds to it, so that 11:34:59 + 3 = 11:35:02, for example?\nThese obvious ideas don't work:\n>>> datetime.time(11, 34, 59) + 3\nTypeError: unsupported operand type(s) for +: 'datetime.time' and 'int'\n>>> datetime.time(11, 34, 59) + datetime.timedelta(0, 3)\nTypeError: unsupported operand type(s) for +: 'datetime.time' and 'datetime.timedelta'\n>>> datetime.time(11, 34, 59) + datetime.time(0, 0, 3)\nTypeError: unsupported operand type(s) for +: 'datetime.time' and 'datetime.time'\n\nIn the end I have written functions like this:\ndef add_secs_to_time(timeval, secs_to_add):\n secs = timeval.hour * 3600 + timeval.minute * 60 + timeval.second\n secs += secs_to_add\n return datetime.time(secs // 3600, (secs % 3600) // 60, secs % 60)\n\nI can't help thinking that I'm missing an easier way to do this though.\nRelated\n\npython time + timedelta equivalent\n\n", "output": "You can use full datetime variables with timedelta, and by providing a dummy date then using time to just get the time value.\nFor example:\nimport datetime\na = datetime.datetime(100,1,1,11,34,59)\nb = a + datetime.timedelta(0,3) # days, seconds, then other fields.\nprint a.time()\nprint b.time()\n\nresults in the two values, three seconds apart:\n11:34:59\n11:35:02\n\nYou could also opt for the more readable\nb = a + datetime.timedelta(seconds=3)\n\nif you're so inclined.\n\nIf you're after a function that can do this, you can look into using addSecs below:\nimport datetime\n\ndef addSecs(tm, secs):\n fulldate = datetime.datetime(100, 1, 1, tm.hour, tm.minute, tm.second)\n fulldate = fulldate + datetime.timedelta(seconds=secs)\n return fulldate.time()\n\na = datetime.datetime.now().time()\nb = addSecs(a, 300)\nprint a\nprint b\n\nThis outputs:\n 09:11:55.775695\n 09:16:55\n\n"} {"input": "Removing a list of characters in string I want to remove characters in a string in python:\nstring.replace(',', 
'').replace(\"!\", '').replace(\":\", '').replace(\";\", '')...\n\nBut I have many characters I have to remove. I thought about a list\nlist = [',', '!', '.', ';'...]\n\nBut how can I use the list to replace the characters in the string?\n", "output": "If you're using python2 and your inputs are strings (not unicodes), the absolutely best method is str.translate:\n>>> chars_to_remove = ['.', '!', '?']\n>>> subj = 'A.B!C?'\n>>> subj.translate(None, ''.join(chars_to_remove))\n'ABC'\n\nOtherwise, there are following options to consider:\nA. Iterate the subject char by char, omit unwanted characters and join the resulting list:\n>>> sc = set(chars_to_remove)\n>>> ''.join([c for c in subj if c not in sc])\n'ABC'\n\n(Note that the generator version ''.join(c for c ...) will be less efficient).\nB. Create a regular expression on the fly and re.sub with an empty string:\n>>> import re\n>>> rx = '[' + re.escape(''.join(chars_to_remove)) + ']'\n>>> re.sub(rx, '', subj)\n'ABC'\n\n(re.escape ensures that characters like ^ or ] won't break the regular expression).\nC. 
Use the mapping variant of translate:\n>>> chars_to_remove = [u'δ', u'Γ', u'ж']\n>>> subj = u'AжBδCΓ'\n>>> dd = {ord(c):None for c in chars_to_remove}\n>>> subj.translate(dd)\nu'ABC'\n\nFull testing code and timings:\n#coding=utf8\n\nimport re\n\ndef remove_chars_iter(subj, chars):\n sc = set(chars)\n return ''.join([c for c in subj if c not in sc])\n\ndef remove_chars_re(subj, chars):\n return re.sub('[' + re.escape(''.join(chars)) + ']', '', subj)\n\ndef remove_chars_re_unicode(subj, chars):\n return re.sub(u'(?u)[' + re.escape(''.join(chars)) + ']', '', subj)\n\ndef remove_chars_translate_bytes(subj, chars):\n return subj.translate(None, ''.join(chars))\n\ndef remove_chars_translate_unicode(subj, chars):\n d = {ord(c):None for c in chars}\n return subj.translate(d)\n\nimport timeit, sys\n\ndef profile(f):\n assert f(subj, chars_to_remove) == test\n t = timeit.timeit(lambda: f(subj, chars_to_remove), number=1000)\n print ('{0:.3f} {1}'.format(t, f.__name__))\n\nprint (sys.version)\nPYTHON2 = sys.version_info[0] == 2\n\nprint ('\\n\"plain\" string:\\n')\n\nchars_to_remove = ['.', '!', '?']\nsubj = 'A.B!C?' 
* 1000\ntest = 'ABC' * 1000\n\nprofile(remove_chars_iter)\nprofile(remove_chars_re)\n\nif PYTHON2:\n profile(remove_chars_translate_bytes)\nelse:\n profile(remove_chars_translate_unicode)\n\nprint ('\\nunicode string:\\n')\n\nif PYTHON2:\n chars_to_remove = [u'δ', u'Γ', u'ж']\n subj = u'AжBδCΓ'\nelse:\n chars_to_remove = ['δ', 'Γ', 'ж']\n subj = 'AжBδCΓ'\n\nsubj = subj * 1000\ntest = 'ABC' * 1000\n\nprofile(remove_chars_iter)\n\nif PYTHON2:\n profile(remove_chars_re_unicode)\nelse:\n profile(remove_chars_re)\n\nprofile(remove_chars_translate_unicode)\n\nResults:\n2.7.5 (default, Mar 9 2014, 22:15:05) \n[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)]\n\n\"plain\" string:\n\n0.637 remove_chars_iter\n0.649 remove_chars_re\n0.010 remove_chars_translate_bytes\n\nunicode string:\n\n0.866 remove_chars_iter\n0.680 remove_chars_re_unicode\n1.373 remove_chars_translate_unicode\n\n---\n\n3.4.2 (v3.4.2:ab2c023a9432, Oct 5 2014, 20:42:22) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]\n\n\"plain\" string:\n\n0.512 remove_chars_iter\n0.574 remove_chars_re\n0.765 remove_chars_translate_unicode\n\nunicode string:\n\n0.817 remove_chars_iter\n0.686 remove_chars_re\n0.876 remove_chars_translate_unicode\n\n(As a side note, the figure for remove_chars_translate_bytes might give us a clue why the industry was reluctant to adopt Unicode for such a long time).\n"} {"input": "Plot logarithmic axes with matplotlib in python I want to plot a graph with one logarithmic axis using matplotlib. \nI've been reading the docs, but can't figure out the syntax. 
I know that it's probably something simple like 'scale=linear' in the plot arguments, but I can't seem to get it right\nSample program: \nfrom pylab import *\nimport matplotlib.pyplot as pyplot\na = [ pow(10,i) for i in range(10) ]\nfig = pyplot.figure()\nax = fig.add_subplot(2,1,1)\n\nline, = ax.plot(a, color='blue', lw=2)\nshow()\n\n", "output": "You can use the Axes.set_yscale method. That allows you to change the scale after the Axes object is created. That would also allow you to build a control to let the user pick the scale if you needed to.\nThe relevant line to add is:\nax.set_yscale('log')\n\nYou can use 'linear' to switch back to a linear scale. Here's what your code would look like:\nfrom pylab import *\nimport matplotlib.pyplot as pyplot\na = [ pow(10,i) for i in range(10) ]\nfig = pyplot.figure()\nax = fig.add_subplot(2,1,1)\n\nline, = ax.plot(a, color='blue', lw=2)\n\nax.set_yscale('log')\n\nshow()\n\n"} {"input": "PyPy -- How can it possibly beat CPython? From the Google Open Source Blog:\n\nPyPy is a reimplementation of Python\n in Python, using advanced techniques\n to try to attain better performance\n than CPython. Many years of hard work\n have finally paid off. Our speed\n results often beat CPython, ranging\n from being slightly slower, to\n speedups of up to 2x on real\n application code, to speedups of up to\n 10x on small benchmarks.\n\nHow is this possible? Which Python implementation was used to implement PyPy? CPython? And what are the chances of a PyPyPy or PyPyPyPy beating their score?\n(On a related note... why would anyone try something like this?)\n", "output": "\"PyPy is a reimplementation of Python in Python\" is a rather misleading way to describe PyPy, IMHO, although it's technically true.\nThere are two major parts of PyPy.\n\nThe translation framework\nThe interpreter\n\nThe translation framework is a compiler. 
It compiles RPython code down to C (or other targets), automatically adding in aspects such as garbage collection and a JIT compiler. It cannot handle arbitrary Python code, only RPython.\nRPython is a subset of normal Python; all RPython code is Python code, but not the other way around. There is no formal definition of RPython, because RPython is basically just \"the subset of Python that can be translated by PyPy's translation framework\". But in order to be translated, RPython code has to be statically typed (the types are inferred, you don't declare them, but it's still strictly one type per variable), and you can't do things like declaring/modifying functions/classes at runtime either.\nThe interpreter then is a normal Python interpreter written in RPython.\nBecause RPython code is normal Python code, you can run it on any Python interpreter. But none of PyPy's speed claims come from running it that way; this is just for a rapid test cycle, because translating the interpreter takes a long time.\nWith that understood, it should be immediately obvious that speculations about PyPyPy or PyPyPyPy don't actually make any sense. You have an interpreter written in RPython. You translate it to C code that executes Python quickly. There the process stops; there's no more RPython to speed up by processing it again.\nSo \"How is it possible for PyPy to be faster than CPython\" also becomes fairly obvious. PyPy has a better implementation, including a JIT compiler (it's generally not quite as fast without the JIT compiler, I believe, which means PyPy is only faster for programs susceptible to JIT-compilation). CPython was never designed to be a highly optimising implementation of the Python language (though they do try to make it a highly optimised implementation, if you follow the difference).\n\nThe really innovative bit of the PyPy project is that they don't write sophisticated GC schemes or JIT compilers by hand. 
They write the interpreter relatively straightforwardly in RPython, and for all RPython is lower level than Python it's still an object-oriented garbage collected language, much more high level than C. Then the translation framework automatically adds things like GC and JIT. So the translation framework is a huge effort, but it applies equally well to the PyPy python interpreter however they change their implementation, allowing for much more freedom in experimentation to improve performance (without worrying about introducing GC bugs or updating the JIT compiler to cope with the changes). It also means when they get around to implementing a Python3 interpreter, it will automatically get the same benefits. And any other interpreters written with the PyPy framework (of which there are a number at varying stages of polish). And all interpreters using the PyPy framework automatically support all platforms supported by the framework.\nSo the true benefit of the PyPy project is to separate out (as much as possible) all the parts of implementing an efficient platform-independent interpreter for a dynamic language. And then come up with one good implementation of them in one place, that can be re-used across many interpreters. That's not an immediate win like \"my Python program runs faster now\", but it's a great prospect for the future.\nAnd it can run your Python program faster (maybe).\n"} {"input": "What does the 'b' character do in front of a string literal? Apparently, the following is valid syntax...\nmy_string = b'The string'\n\nI would like to know...\n\nWhat does this b character infront of the string mean?\nWhat are the effects of using it?\nWhat are appropriate situations to use it.\n\nI found a related question right here on SO but that question is about PHP though and it states the b is used to indicate the string is binary as opposed to unicode which was needed for code to be compatible from version of PHP < 6 when migrating to PHP 6. 
I don't think this applies to Python.\nI did find this documentation on the python site about using a u character in the same syntax to specify a string as unicode. Unfortunately it doesn't mention the b character anywhere in that document.\nAlso, just out of curiosity, are there more symbols than the b and u that do other things?\n", "output": "Python 3.x makes a clear distinction between the types:\n\nstr = '...' literals = a sequence of Unicode characters (UTF-16 or UTF-32, depending on how Python was compiled)\nbytes = b'...' literals = a sequence of octets (integers between 0 and 255)\n\nIf you're familiar with Java or C#, think of str as String and bytes as byte[]. If you're familiar with SQL, think of str as NVARCHAR and bytes as BINARY or BLOB. If you're familiar with the Windows registry, think of str as REG_SZ and bytes as REG_BINARY. If you're familiar with C(++), then forget everything you've learned about char and strings, because A CHARACTER IS NOT A BYTE. That idea is long obsolete.\nYou use str when you want to represent text.\nprint('שלום עולם')\n\nYou use bytes when you want to represent low-level binary data like structs.\nNaN = struct.unpack('>d', b'\\xff\\xf8\\x00\\x00\\x00\\x00\\x00\\x00')[0]\n\nYou can encode a str to a bytes object.\n>>> '\\uFEFF'.encode('UTF-8')\nb'\\xef\\xbb\\xbf'\n\nAnd you can decode a bytes into a str.\n>>> b'\\xE2\\x82\\xAC'.decode('UTF-8')\n'€'\n\nBut you can't freely mix the two types.\n>>> b'\\xEF\\xBB\\xBF' + 'Text with a UTF-8 BOM'\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nTypeError: can't concat bytes to str\n\nThe b'...' 
notation is somewhat confusing in that it allows the bytes 0x01-0x7F to be specified with ASCII characters instead of hex numbers.\n>>> b'A' == b'\\x41'\nTrue\n\nBut I must emphasize, a character is not a byte.\n>>> 'A' == b'A'\nFalse\n\nIn Python 2.x\nPre-3.0 versions of Python lacked this kind of distinction between text and binary data. Instead, there was:\n\nunicode = u'...' literals = sequence of Unicode characters = 3.x str\nstr = '...' literals = sequences of confounded bytes/characters\n\n\nUsually text, encoded in some unspecified encoding.\nBut also used to represent binary data like struct.pack output.\n\n\nIn order to ease the 2.x-to-3.x transition, the b'...' literal syntax was backported to Python 2.6, in order to allow distinguishing binary strings (which should be bytes in 3.x) from text strings (which should be str in 3.x). The b prefix does nothing in 2.x, but tells the 2to3 script not to convert it to a Unicode string in 3.x.\nSo yes, b'...' literals in Python have the same purpose that they do in PHP.\n\nAlso, just out of curiosity, are there\n more symbols than the b and u that do\n other things?\n\nThe r prefix creates a raw string (e.g., r'\\t' is a backslash + t instead of a tab), and triple quotes '''...''' or \"\"\"...\"\"\" allow multi-line string literals.\n"} {"input": "add one row in a pandas.DataFrame I understand that pandas is designed to load fully populated DataFrame but I need to create an empty DataFrame then add rows, one by one.\nWhat is the best way to do this ?\nI successfully created an empty DataFrame with :\nres = DataFrame(columns=('lib', 'qty1', 'qty2'))\n\nThen I can add a new row and fill a field with :\nres = res.set_value(len(res), 'qty1', 10.0)\n\nIt works but seems very odd :-/ (it fails for adding string value)\nHow can I add a new row to my DataFrame (with different columns type) ?\n", "output": "You could use pandas.concat() or DataFrame.append(). 
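For example, here is a minimal sketch of the concat approach (the column names are the ones from the question; the row values are made up for illustration):

```python
import pandas as pd

# The question's empty frame:
res = pd.DataFrame(columns=('lib', 'qty1', 'qty2'))

# Build each new row as a one-row DataFrame, then concatenate it on.
# This handles mixed column types (strings and floats alike).
new_row = pd.DataFrame([{'lib': 'A', 'qty1': 10.0, 'qty2': 20.0}])
res = pd.concat([res, new_row], ignore_index=True)

print(res)
```

Note that appending row by row this way copies the frame on every call, so for many rows it is usually faster to collect dicts in a plain list and build the DataFrame once at the end.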
For details and examples, see Merge, join, and concatenate.\n"} {"input": "overloading __init__ in python Let's say I have a class that has a member called data which is a list. \nI want to be able to initialize the class with, for example, a filename (which contains data to initialize the list) or with an actual list.\nWhat's your technique for doing this?\nDo you just check the type by looking at __class__?\nIs there some trick I might be missing?\nI'm used to C++ where overloading by argument type is easy. \nThanks.\n", "output": "A much neater way to get 'alternate constructors' is to use classmethods. For instance:\n>>> class MyData:\n... def __init__(self, data):\n... \"Initialize MyData from a sequence\"\n... self.data = data\n... \n... @classmethod\n... def fromfilename(cls, filename):\n... \"Initialize MyData from a file\"\n... data = open(filename).readlines()\n... return cls(data)\n... \n... @classmethod\n... def fromdict(cls, datadict):\n... \"Initialize MyData from a dict's items\"\n... return cls(datadict.items())\n... \n>>> MyData([1, 2, 3]).data\n[1, 2, 3]\n>>> MyData.fromfilename(\"/tmp/foobar\").data\n['foo\\n', 'bar\\n', 'baz\\n']\n>>> MyData.fromdict({\"spam\": \"ham\"}).data\n[('spam', 'ham')]\n\nThe reason it's neater is that there is no doubt about what type is expected, and you aren't forced to guess at what the caller intended for you to do with the datatype it gave you. The problem with isinstance(x, basestring) is that there is no way for the caller to tell you, for instance, that even though the type is not a basestring, you should treat it as a string (and not another sequence.) And perhaps the caller would like to use the same type for different purposes, sometimes as a single item, and sometimes as a sequence of items. 
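To make that ambiguity concrete, here is a hedged sketch of the type-sniffing alternative (a hypothetical class, not code from the question or the answer above):

```python
class SniffedData:
    """Hypothetical __init__ that guesses the caller's intent from the type."""
    def __init__(self, arg):
        if isinstance(arg, str):
            # Guess: a string must be a filename... but what if the caller
            # meant the string itself as a single data item? We can't know.
            self.data = ['<contents of %s>' % arg]
        else:
            self.data = list(arg)

# The caller has no way to say "treat this string as data, not a path":
d = SniffedData('spam')
print(d.data)
```

The classmethod constructors above sidestep that guesswork entirely, because the constructor's name states the intent.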
Being explicit takes all doubt away and leads to more robust and clearer code.\n"} {"input": "python ImportError No module named I am very new at Python and I am getting this error:\nTraceback (most recent call last):\n File \"mountain.py\", line 28, in ?\n from toolkit.interface import interface\nImportError: No module named toolkit.interface\n\nPython is installed in a local directory:\nMy directory tree is like this:\n(local directory)/site-packages/toolkit/interface.py\n\nMy code is in here \n(local directory)/site-packages/toolkit/examples/mountain.py\n\nTo run the example I do python mountain.py, and in the code I have:\nfrom toolkit.interface import interface\n\nAnd i get the error that I wrote, I have already checked sys.path and there I have the directory /site-packages, also I have the file __init__.py.bin in the toolkit folder to indicate to Python that this is a package. I also have a __init__.py.bin in the examples directory.\nI do not know why Python cannot find the file when it is in sys.path, any ideas? Can be a permissions problem? Do I need execution permission?\n", "output": "Based on your comments to orip's post, I guess this is what happened:\n\nYou edited __init__.py on windows.\nThe windows editor added something non-printing, perhaps a carriage-return (end-of-line in Windows is CR/LF; in unix it is LF only), or perhaps a CTRL-Z (windows end-of-file).\nYou used WinSCP to copy the file to your unix box.\nWinSCP thought: \"This has something that's not basic text; I'll put a .bin extension to indicate binary data.\"\nThe missing __init__.py (now called __init__.py.bin) means python doesn't understand toolkit as a package.\nYou create __init__.py in the appropriate directory and everything works... ?\n\n"} {"input": "How to delete a character from a string using python? There is a string, for example EXAMPLE\nHow can I remove the middle character i.e. M from it. 
I don't need the code, what I want to know is\n\nDo strings in Python end in any special character?\nWhich is a better way - shifting everything right to left starting from the middle character OR creation of a new string and not copying the middle character?\n\n", "output": "In Python, strings are immutable, so you have to create a new string. You have a few options of how to create the new string. If you want to remove the 'M' wherever it appears:\nnewstr = oldstr.replace(\"M\", \"\")\n\nIf you want to remove the central character:\nmidlen = len(oldstr)//2 # integer division, so this also works on Python 3\nnewstr = oldstr[:midlen] + oldstr[midlen+1:]\n\nYou asked if strings end with a special character. No, you are thinking like a C programmer. In Python, strings are stored with their length, so any byte value, including \\0, can appear in a string.\n"} {"input": "What's the difference between raw_input() and input() in python3.x? What is the difference between raw_input() and input() in Python 3.x?\n", "output": "The difference is that raw_input() does not exist in Python 3.x, while input() does. Actually, the old raw_input() has been renamed to input(), and the old input() is gone (but can easily be simulated by using eval(input())).\n"} {"input": "How to fix python indentation I have some Python code with inconsistent indentation: there is a lot of mixing of tabs and spaces, and to make matters even worse, even the space indentation is not consistent.\nThe code works as expected but it's difficult to maintain.\nHow can I fix the indentation (like \"html tidy\" but for python) without breaking the code?\n", "output": "Use the reindent.py script that you find in the Tools/scripts/ directory of your Python installation:\n\nChange Python (.py) files to use\n 4-space indents and no hard tab\n characters. Also trim excess spaces\n and tabs from ends of lines, and\n remove empty lines at the end of\n files. 
Also ensure the last line ends\n with a newline.\n\nHave a look at that script for detailed usage instructions.\n"} {"input": "How do you create a daemon in Python? Searching on Google reveals x2 code snippets. The first result is to this code recipe which has a lot of documentation and explanation, along with some useful discussion underneath. \nHowever, another code sample, whilst not containing so much documentation, includes sample code for passing commands such as start, stop and restart. It also creates a PID file which can be handy for checking if the daemon is already running etc.\nThese samples both explain how to create the daemon. Are there any additional things that need to be considered? Is one sample better than the other, and why?\n", "output": "There are many fiddly things to take care of when becoming a well-behaved daemon process:\n\nprevent core dumps (many daemons run as root, and core dumps can contain sensitive information)\nbehave correctly inside a chroot gaol\nset UID, GID, working directory, umask, and other process parameters appropriately for the use case\nrelinquish elevated suid, sgid privileges\nclose all open file descriptors, with exclusions depending on the use case\nbehave correctly if started inside an already-detached context, such as init, inetd, etc.\nset up signal handlers for sensible daemon behaviour, but also with specific handlers determined by the use case\nredirect the standard streams stdin, stdout, stderr since a daemon process no longer has a controlling terminal\nhandle a PID file as a cooperative advisory lock, which is a whole can of worms in itself with many contradictory but valid ways to behave\nallow proper cleanup when the process is terminated\nactually become a daemon process without leading to zombies\n\nSome of these are standard, as described in canonical Unix literature (Advanced Programming in the UNIX Environment, by the late W. Richard Stevens, Addison-Wesley, 1992). 
Others, such as stream redirection and PID file handling, are conventional behaviour most daemon users would expect but that are less standardised.\nAll of these are covered by the PEP 3143 \u00e2\u0080\u009cStandard daemon process library\u00e2\u0080\u009d specification. The python-daemon reference implementation works on Python 2.7 or later, and Python 3.2 or later.\n"} {"input": "Wrapping a C library in Python: C, Cython or ctypes? I want to call a C library from a Python application. I don't want to wrap the whole API, only the functions and datatypes that are relevant to my case. As I see it, I have three choices:\n\nCreate an actual extension module in C. Probably overkill, and I'd also like to avoid the overhead of learning extension writing.\nUse Cython to expose the relevant parts from the C library to Python.\nDo the whole thing in Python, using ctypes to communicate with the external library.\n\nI'm not sure whether 2) or 3) is the better choice. The advantage of 3) is that ctypes is part of the standard library, and the resulting code would be pure Python \u2013 although I'm not sure how big that advantage actually is.\nAre there more advantages / disadvantages with either choice? Which approach do you recommend?\n\nEdit: Thanks for all your answers, they provide a good resource for anyone looking to do something similar. The decision, of course, is still to be made for the single case\u2014there's no one \"This is the right thing\" sort of answer. For my own case, I'll probably go with ctypes, but I'm also looking forward to trying out Cython in some other project.\nWith there being no single true answer, accepting one is somewhat arbitrary; I chose FogleBird's answer as it provides some good insight into ctypes and it currently also is the highest-voted answer. However, I suggest to read all the answers to get a good overview.\nThanks again.\n", "output": "Warning: a Cython core developer's opinion ahead.\nI almost always recommend Cython over ctypes. 
The reason is that it has a much smoother upgrade path. If you use ctypes, many things will be simple at first, and it's certainly cool to write your FFI code in plain Python, without compilation, build dependencies and all that. However, at some point, you will almost certainly find that you have to call into your C library a lot, either in a loop or in a longer series of interdependent calls, and you would like to speed that up. That's the point where you'll notice that you can't do that with ctypes. Or, when you need callback functions and you find that your Python callback code becomes a bottleneck, you'd like to speed it up and/or move it down into C as well. Again, you cannot do that with ctypes. So you have to switch languages at that point and start rewriting parts of your code, potentially reverse engineering your Python/ctypes code into plain C, thus spoiling the whole benefit of writing your code in plain Python in the first place.\nWith Cython, OTOH, you're completely free to make the wrapping and calling code as thin or thick as you want. You can start with simple calls into your C code from regular Python code, and Cython will translate them into native C calls, without any additional calling overhead, and with an extremely low conversion overhead for Python parameters. When you notice that you need even more performance at some point where you are making too many expensive calls into your C library, you can start annotating your surrounding Python code with static types and let Cython optimise it straight down into C for you. Or, you can start rewriting parts of your C code in Cython in order to avoid calls and to specialise and tighten your loops algorithmically. And if you need a fast callback, just write a function with the appropriate signature and pass it into the C callback registry directly. Again, no overhead, and it gives you plain C calling performance. 
And in the much less likely case that you really cannot get your code fast enough in Cython, you can still consider rewriting the truly critical parts of it in C (or C++ or Fortran) and calling it from your Cython code naturally and natively. But then, this really becomes the last resort instead of the only option.\nSo, ctypes is nice for doing simple things and quickly getting something running. However, as soon as things start to grow, you'll most likely come to the point where you notice that you'd have been better off using Cython right from the start.\n"} {"input": "Get exception description and stack trace which caused an exception, all as a string I've seen a lot of posts about stack trace and exceptions in Python. But haven't found what I need.\nI have a chunk of Python 2.7 code that may raise an exception. I would like to catch it and assign to a string its full description and the stack trace that caused the error (simply all we are used to seeing on the console). I need this string to print it to a text box in the GUI.\nSomething like this:\ntry:\n method_that_can_raise_an_exception(params)\nexcept Exception, e:\n print_to_textbox(complete_exception_description(e))\n\nThe problem is: what is the function complete_exception_description?\n", "output": "See the traceback module, specifically the format_exc() function.\nimport traceback\n\ntry:\n raise ValueError\nexcept:\n tb = traceback.format_exc()\nelse:\n tb = \"No error\"\nfinally:\n print tb\n\n"} {"input": "Python int to binary? Are there any canned Python methods to convert an Integer (or Long) into a binary string in Python?\nThere are a myriad of dec2bin() functions out on Google... But I was hoping I could use a built-in function / library.\n", "output": "Python's string format method can take a format spec. 
\n>>> \"{0:b}\".format(10)\n'1010'\n\nFormat spec docs for Python 2 \nFormat spec docs for Python 3\n"} {"input": "Is it possible to break a long line to multiple lines in Python Just like C, you can break a long line into multiple short lines. But in Python, if I do this, there will be an indent error... Is it possible?\n", "output": "From PEP 8 - Style Guide for Python Code:\n\nThe preferred way of wrapping long lines is by using Python's implied line\n continuation inside parentheses, brackets and braces. If necessary, you\n can add an extra pair of parentheses around an expression, but sometimes\n using a backslash looks better. Make sure to indent the continued line\n appropriately. The preferred place to break around a binary operator is\n after the operator, not before it.\n\nExample of implicit line continuation:\na = some_function(\n '1' + '2' + '3' + '4')\n\nExample of explicit line continuation:\na = '1' + '2' + \\\n '3' + '4'\n\n"} {"input": "Why does python use 'else' after for and while loops? I understand how this construct works:\nfor i in range(10):\n print(i)\n\n if i == 9:\n print(\"Too big - I'm giving up!\")\n break;\nelse:\n print(\"Completed successfully\")\n\nBut I don't understand why else is used as the keyword here, since it suggests the code in question only runs if the for block does not complete, which is the opposite of what it does! No matter how I think about it, my brain can't progress seamlessly from the for statement to the else block. To me, continue or continuewith would make more sense (and I'm trying to train myself to read it as such).\nI'm wondering how Python coders read this construct in their head (or aloud, if you like). Perhaps I'm missing something that would make such code blocks more easily decipherable?\n", "output": "A common construct is to run a loop until something is found and then to break out of the loop. The problem is that if I break out of the loop or the loop ends I need to determine which case happened. 
One method is to create a flag or storage variable that will let me do a second test to see how the loop was exited.\nFor example, assume that I need to search through a list and process each item until a flag item is found and then stop processing. If the flag item is missing then an exception needs to be raised.\nUsing the Python for...else construct you have\n...\n\nfor i in mylist:\n if i == theflag:\n break\n process(i)\nelse:\n raise ValueError(\"List argument missing terminal flag.\")\n\n...\n\nCompare this to a method that does not use this syntactic sugar:\n...\n\nflagfound = False\nfor i in mylist:\n if i == theflag:\n flagfound = True\n break\n process(i)\n\nif not flagfound:\n raise ValueError(\"List argument missing terminal flag.\")\n\n...\n\nIn the first case the raise is bound tightly to the for loop it works with. In the second the binding is not as strong and errors may be introduced during maintenance.\n"} {"input": "Why use argparse rather than optparse? I noticed that the Python 2.7 documentation includes yet another command-line parsing module. In addition to getopt and optparse we now have argparse.\nWhy has yet another command-line parsing module been created? Why should I use it instead of optparse? 
Are there new features that I should know about?\n", "output": "As of 2.7, optparse is deprecated, and will hopefully go away in the future.\nargparse is better for all the reasons listed on its original page (http://code.google.com/p/argparse/):\n\nhandling positional arguments\nsupporting sub-commands\nallowing alternative option prefixes like + and /\nhandling zero-or-more and one-or-more style arguments\nproducing more informative usage messages\nproviding a much simpler interface for custom types and actions\n\nMore information is also in PEP 389, which is the vehicle by which argparse made it into the standard library.\n"} {"input": "python: how to import the class within the same directory or sub directory I have a directory that stores all the .py files.\nbin/\n main.py\n user.py # where class User resides\n dir.py # where class Dir resides\n\nI want to use classes from user.py and dir.py in main.py.\nHow can I import these Python classes into main.py?\nFurthermore, how can I import class User if user.py is in a sub directory?\nbin/\n dir.py\n main.py\n usr/\n user.py\n\n", "output": "Python 2\nMake an empty file called __init__.py in the same directory as the files. That will signify to Python that it's \"ok to import from this directory\".\nThen just do...\nfrom user import User\nfrom dir import Dir\n\nThe same holds true if the files are in a subdirectory - put an __init__.py in the subdirectory as well, and then use regular import statements, with dot notation. For each level of directory, you need to add to the import path. \nbin/\n main.py\n classes/\n user.py\n dir.py\n\nSo if the directory was named \"classes\", then you'd do this:\nfrom classes.user import User\nfrom classes.dir import Dir\n\nPython 3\nSame as above, but prefix the module name with a . to make the relative import explicit, since Python 3 no longer supports implicit relative imports within a package:\nfrom .user import User\nfrom .dir import Dir\n\n"} {"input": "How can I check the syntax of Python script without executing it? 
I used to use perl -c programfile to check the syntax of a Perl program and then exit without executing it. Is there an equivalent way to do this for a Python script?\n", "output": "You can check the syntax by compiling it:\npython -m py_compile script.py\n\n"} {"input": "Python in Xcode 7 How does one create a Python friendly environment in Xcode 7?\n", "output": "I figured it out! The steps make it look like it will take more effort than it actually does.\nThese instructions are for creating a project from scratch. If you have existing Python scripts that you wish to include in this project, you will obviously need to slightly deviate from these instructions.\nIf you find that these instructions no longer work or are unclear due to changes in Xcode updates, please let me know. I will make the necessary corrections.\n\nOpen Xcode. The instructions for either are the same.\nIn the menu bar, click \"File\" → \"New\" → \"New Project…\".\nSelect \"Other\" in the left pane, then \"External Build System\" in the right pane, and then click \"Next\".\nEnter the product name, organization name, and organization identifier.\nFor the \"Build Tool\" field, type in /usr/local/bin/python3 for Python 3 or /usr/bin/python for Python 2 and then click \"Next\". Note that this assumes you have the symbolic link (that is set up by default) that resolves to the Python executable. If you are unsure as to where your Python executables are, enter either of these commands into Terminal: which python3 and which python.\nClick \"Next\".\nChoose where to save it and click \"Create\".\nIn the menu bar, click \"File\" → \"New\" → \"New File…\".\nSelect \"Other\" under \"OS X\".\nSelect \"Empty\" and click \"Next\".\nNavigate to the project folder (it will not work, otherwise), enter the name of the Python file (including the \".py\" extension), and click \"Create\".\nIn the menu bar, click \"Product\" → \"Scheme\" → \"Edit Scheme…\".\nClick \"Run\" in the left pane.\nIn the \"Info\" tab, click the \"Executable\" field and then click \"Other…\".\nNavigate to the executable from Step 6. You may need to use ⇧⌘G to type in the directory if it is hidden.\nSelect the executable and click \"Choose\".\nUncheck \"Debug executable\". If you skip this step, Xcode will try to debug the Python executable itself. I am unaware of a way to integrate an external debugging tool into Xcode.\nClick the \"+\" icon under \"Arguments Passed On Launch\". You might have to expand that section by clicking on the triangle pointing to the right.\nType in $(SRCROOT)/ (or $(SOURCE_ROOT)/) and then the name of the Python file you want to test. Remember, the Python program must be in the project folder. Otherwise, you will have to type out the full path (or relative path if it's in a subfolder of the project folder) here. If there are spaces anywhere in the full path, you must include quotation marks at the beginning and end of this.\nClick \"Close\".\n\nNote that if you open the \"Utilities\" panel, with the \"Show the File inspector\" tab active, the file type is automatically set to \"Default - Python script\". Feel free to look through all the file type options it has, to gain an idea as to what all it is capable of doing. The method above can be applied to any interpreted language. As of right now, I have yet to figure out exactly how to get it to work with Java; then again, I haven't done too much research. Surely there is some documentation floating around on the web about all of this.\nRunning without administrative privileges:\nIf you do not have administrative privileges or are not in the Developer group, you can still use Xcode for Python programming (but you still won't be able to develop in languages that require compiling). Instead of using the play button, in the menu bar, click \"Product\" → \"Perform Action\" → \"Run Without Building\" or simply use the keyboard shortcut ^⌘R.\nOther Notes:\nTo change the text encoding, line endings, and/or indentation settings, open the \"Utilities\" panel and make the \"Show the File inspector\" tab active. There, you will find these settings.\nFor more information about Xcode's build settings, there is no better source than this. I'd be interested in hearing from somebody who got this to work with unsupported compiled languages. This process should work for any other interpreted language. 
Just be sure to change Step 6 and Step 16 accordingly.\n"} {"input": "Test if executable exists in Python? In Python, is there a portable and simple way to test if an executable program exists?\nBy simple I mean something like the which command which would be just perfect. I don't want to search PATH manually or something involving trying to execute it with Popen & al and see if it fails (that's what I'm doing now, but imagine it's launchmissiles)\n", "output": "Easiest way I can think of: \ndef which(program):\n import os\n def is_exe(fpath):\n return os.path.isfile(fpath) and os.access(fpath, os.X_OK)\n\n fpath, fname = os.path.split(program)\n if fpath:\n if is_exe(program):\n return program\n else:\n for path in os.environ[\"PATH\"].split(os.pathsep):\n path = path.strip('\"')\n exe_file = os.path.join(path, program)\n if is_exe(exe_file):\n return exe_file\n\n return None\n\nEdit: Updated code sample to include logic for handling case where provided argument is already a full path to the executable, i.e. \"which /bin/ls\". This mimics the behavior of the UNIX 'which' command.\nEdit: Updated to use os.path.isfile() instead of os.path.exists() per comments.\n"} {"input": "How would you make a comma-separated string from a list? What would be your preferred way to concatenate strings from a sequence such that between each two consecutive pair a comma is added. That is, how do you map, for instance, [ 'a', 'b', 'c' ] to 'a,b,c'? (The cases [ s ] and [] should be mapped to s and '', respectively.)\nI usually end up using something like ''.join(map(lambda x: x+',',l))[:-1], but also feeling somewhat unsatisfied.\nEdit: I'm both ashamed and happy that the solution is so simple. Obviously I have hardly a clue as to what I'm doing. 
(I probably needed \"simple\" concatenation in the past and somehow memorised s.join([e1,e2,...]) as a shorthand for s+e1+e2+....)\n", "output": "myList = ['a','b','c','d']\nmyString = \",\".join(myList)\n\nThis won't work if the list contains numbers.\n\nAs Ricardo Reyes suggested, if it contains non-string types (such as integers, floats, bools, None) then do:\nmyList = ','.join(map(str, myList))\n\n"} {"input": "TypeError: 'module' object is not callable File \"C:\\Users\\Administrator\\Documents\\Mibot\\oops\\blinkserv.py\", line 82, in __init__\n self.serv = socket(AF_INET,SOCK_STREAM)\nTypeError: 'module' object is not callable\n\nWhy am I getting this error?\nI'm confused.\nWhat do you need to know to answer my question?\n", "output": "socket is a module, containing the class socket.\nYou need to do socket.socket(...) or from socket import socket:\n>>> import socket\n>>> socket\n<module 'socket' from '...'>\n>>> socket.socket\n<class 'socket._socketobject'>\n>>>\n>>> from socket import socket\n>>> socket\n<class 'socket._socketobject'>\n\nNotice that the error here is pretty easy to understand: if module object is not callable, you are probably calling a module object! What's a module object? It's the type of thing you get when you import a module.\nBut even if you don't understand that (which is fine, it's often confusing that classes, modules, functions etc are all just objects like everything else), there's a pretty easy way to start debugging:\n\n\"Hmm, module object is not callable. That sounds like I'm trying to call something that you can't call. I wonder what I'm trying to call?\"\n\"Oh, I'm calling socket. That should be callable! I wonder if the variable socket is what I think it is?\"\nprint socket\n\n"} {"input": "Python thread pool similar to the multiprocessing Pool? 
Is there a Pool class for worker threads, similar to the multiprocessing module's Pool class?\nI like, for example, the easy way to parallelize a map function\ndef long_running_func(p):\n c_func_no_gil(p)\n\np = multiprocessing.Pool(4)\nxs = p.map(long_running_func, range(100))\n\nhowever I would like to do it without the overhead of creating new processes.\nI know about the GIL. However, in my use case, the function will be an IO-bound C function for which the python wrapper will release the GIL before the actual function call.\nDo I have to write my own threading pool?\n", "output": "I just found out that there actually is a thread-based Pool interface in the multiprocessing module, however it is hidden somewhat and not properly documented.\nIt can be imported via\nfrom multiprocessing.pool import ThreadPool\n\nIt is implemented using a dummy Process class wrapping a python thread. This thread-based Process class can be found in multiprocessing.dummy, which is mentioned briefly in the docs. This dummy module supposedly provides the whole multiprocessing interface based on threads.\n"} {"input": "Python: import a file from a subdirectory I have a file called tester.py, located in /project.\n/project has a subdirectory called lib, with a file called BoxTime.py:\n/project/tester.py\n/project/lib/BoxTime.py\n\nI want to import BoxTime from tester. 
I have tried this:\nimport lib.BoxTime\n\nWhich resulted:\nTraceback (most recent call last):\n File \"./tester.py\", line 3, in <module>\n import lib.BoxTime\nImportError: No module named lib.BoxTime\n\nAny ideas how to import BoxTime from the subdirectory?\nEDIT\nThe __init__.py was the problem, but don't forget to refer to BoxTime as lib.BoxTime, or use:\nimport lib.BoxTime as BT\n...\nBT.bt_function()\n\n", "output": "Take a look at the Packages documentation (Section 6.4) here: http://docs.python.org/tutorial/modules.html\nIn short, you need to put a blank file named\n__init__.py\n\nin the \"lib\" directory.\n"} {"input": "Can a variable number of arguments be passed to a function? In a similar way to using varargs in C or C++:\nfn(a, b)\nfn(a, b, c, d, ...)\n\n", "output": "Yes.\nThis is simple and works if you disregard keyword arguments:\ndef manyArgs(*arg):\n print \"I was called with\", len(arg), \"arguments:\", arg\n\n>>> manyArgs(1)\nI was called with 1 arguments: (1,)\n>>> manyArgs(1, 2, 3)\nI was called with 3 arguments: (1, 2, 3)\n\nAs you can see, Python will give you a single tuple with all the arguments.\nFor keyword arguments you need to accept those as a separate actual argument, as shown in Skurmedel's answer.\n"} {"input": "Getting a list of all subdirectories in the current directory Is there a way to return a list of all the subdirectories in the current directory in Python?\nI know you can do this with files, but I need to get the list of directories instead.\n", "output": "Do you mean immediate subdirectories, or every directory right down the tree? \nEither way, you could use os.walk to do this:\nos.walk(directory)\n\nwill yield a tuple for each subdirectory. 
The first entry in the 3-tuple is a directory name, so\n[x[0] for x in os.walk(directory)]\n\nshould give you all of the directories.\nNote that the second entry in the tuple is the list of child directories of the entry in the first position, so you could use this instead, but it's not likely to save you much.\nHowever, you could use it just to give you the immediate child directories:\nnext(os.walk('.'))[1]\n\nOr see the other solutions already posted, using os.listdir and os.path.isdir, including those at \"get all of the immediate subdirectories in python\".\n"} {"input": "Python - Pytz - List of Timezones? I would like to know what are all the possible values for the timezone argument in the Python library Pytz.\nSOLUTION\nfor tz in pytz.all_timezones:\n print tz\n\n\nAfrica/Abidjan\nAfrica/Accra\nAfrica/Addis_Ababa\nAfrica/Algiers\nAfrica/Asmara\nAfrica/Asmera\nAfrica/Bamako\nAfrica/Bangui\nAfrica/Banjul\nAfrica/Bissau\nAfrica/Blantyre\nAfrica/Brazzaville\nAfrica/Bujumbura\nAfrica/Cairo\nAfrica/Casablanca\nAfrica/Ceuta\nAfrica/Conakry\nAfrica/Dakar\nAfrica/Dar_es_Salaam\nAfrica/Djibouti\nAfrica/Douala\nAfrica/El_Aaiun\nAfrica/Freetown\nAfrica/Gaborone\nAfrica/Harare\nAfrica/Johannesburg\nAfrica/Juba\nAfrica/Kampala\nAfrica/Khartoum\nAfrica/Kigali\nAfrica/Kinshasa\nAfrica/Lagos\nAfrica/Libreville\nAfrica/Lome\nAfrica/Luanda\nAfrica/Lubumbashi\nAfrica/Lusaka\nAfrica/Malabo\nAfrica/Maputo\nAfrica/Maseru\nAfrica/Mbabane\nAfrica/Mogadishu\nAfrica/Monrovia\nAfrica/Nairobi\nAfrica/Ndjamena\nAfrica/Niamey\nAfrica/Nouakchott\nAfrica/Ouagadougou\nAfrica/Porto-Novo\nAfrica/Sao_Tome\nAfrica/Timbuktu\nAfrica/Tripoli\nAfrica/Tunis\nAfrica/Windhoek\nAmerica/Adak\nAmerica/Anchorage\nAmerica/Anguilla\nAmerica/Antigua\nAmerica/Araguaina\nAmerica/Argentina/Buenos_Aires\nAmerica/Argentina/Catamarca\nAmerica/Argentina/ComodRivadavia\nAmerica/Argentina/Cordoba\nAmerica/Argentina/Jujuy\nAmerica/Argentina/La_Rioja\nAmerica/Argentina/Mendoza\nAmerica/Argentina/Rio_Gallegos\nAmerica/Ar
gentina/Salta\nAmerica/Argentina/San_Juan\nAmerica/Argentina/San_Luis\nAmerica/Argentina/Tucuman\nAmerica/Argentina/Ushuaia\nAmerica/Aruba\nAmerica/Asuncion\nAmerica/Atikokan\nAmerica/Atka\nAmerica/Bahia\nAmerica/Bahia_Banderas\nAmerica/Barbados\nAmerica/Belem\nAmerica/Belize\nAmerica/Blanc-Sablon\nAmerica/Boa_Vista\nAmerica/Bogota\nAmerica/Boise\nAmerica/Buenos_Aires\nAmerica/Cambridge_Bay\nAmerica/Campo_Grande\nAmerica/Cancun\nAmerica/Caracas\nAmerica/Catamarca\nAmerica/Cayenne\nAmerica/Cayman\nAmerica/Chicago\nAmerica/Chihuahua\nAmerica/Coral_Harbour\nAmerica/Cordoba\nAmerica/Costa_Rica\nAmerica/Creston\nAmerica/Cuiaba\nAmerica/Curacao\nAmerica/Danmarkshavn\nAmerica/Dawson\nAmerica/Dawson_Creek\nAmerica/Denver\nAmerica/Detroit\nAmerica/Dominica\nAmerica/Edmonton\nAmerica/Eirunepe\nAmerica/El_Salvador\nAmerica/Ensenada\nAmerica/Fort_Wayne\nAmerica/Fortaleza\nAmerica/Glace_Bay\nAmerica/Godthab\nAmerica/Goose_Bay\nAmerica/Grand_Turk\nAmerica/Grenada\nAmerica/Guadeloupe\nAmerica/Guatemala\nAmerica/Guayaquil\nAmerica/Guyana\nAmerica/Halifax\nAmerica/Havana\nAmerica/Hermosillo\nAmerica/Indiana/Indianapolis\nAmerica/Indiana/Knox\nAmerica/Indiana/Marengo\nAmerica/Indiana/Petersburg\nAmerica/Indiana/Tell_City\nAmerica/Indiana/Vevay\nAmerica/Indiana/Vincennes\nAmerica/Indiana/Winamac\nAmerica/Indianapolis\nAmerica/Inuvik\nAmerica/Iqaluit\nAmerica/Jamaica\nAmerica/Jujuy\nAmerica/Juneau\nAmerica/Kentucky/Louisville\nAmerica/Kentucky/Monticello\nAmerica/Knox_IN\nAmerica/Kralendijk\nAmerica/La_Paz\nAmerica/Lima\nAmerica/Los_Angeles\nAmerica/Louisville\nAmerica/Lower_Princes\nAmerica/Maceio\nAmerica/Managua\nAmerica/Manaus\nAmerica/Marigot\nAmerica/Martinique\nAmerica/Matamoros\nAmerica/Mazatlan\nAmerica/Mendoza\nAmerica/Menominee\nAmerica/Merida\nAmerica/Metlakatla\nAmerica/Mexico_City\nAmerica/Miquelon\nAmerica/Moncton\nAmerica/Monterrey\nAmerica/Montevideo\nAmerica/Montreal\nAmerica/Montserrat\nAmerica/Nassau\nAmerica/New_York\nAmerica/Nipigon\nAmerica/Nome\nAmerica/Noronha\
nAmerica/North_Dakota/Beulah\nAmerica/North_Dakota/Center\nAmerica/North_Dakota/New_Salem\nAmerica/Ojinaga\nAmerica/Panama\nAmerica/Pangnirtung\nAmerica/Paramaribo\nAmerica/Phoenix\nAmerica/Port-au-Prince\nAmerica/Port_of_Spain\nAmerica/Porto_Acre\nAmerica/Porto_Velho\nAmerica/Puerto_Rico\nAmerica/Rainy_River\nAmerica/Rankin_Inlet\nAmerica/Recife\nAmerica/Regina\nAmerica/Resolute\nAmerica/Rio_Branco\nAmerica/Rosario\nAmerica/Santa_Isabel\nAmerica/Santarem\nAmerica/Santiago\nAmerica/Santo_Domingo\nAmerica/Sao_Paulo\nAmerica/Scoresbysund\nAmerica/Shiprock\nAmerica/Sitka\nAmerica/St_Barthelemy\nAmerica/St_Johns\nAmerica/St_Kitts\nAmerica/St_Lucia\nAmerica/St_Thomas\nAmerica/St_Vincent\nAmerica/Swift_Current\nAmerica/Tegucigalpa\nAmerica/Thule\nAmerica/Thunder_Bay\nAmerica/Tijuana\nAmerica/Toronto\nAmerica/Tortola\nAmerica/Vancouver\nAmerica/Virgin\nAmerica/Whitehorse\nAmerica/Winnipeg\nAmerica/Yakutat\nAmerica/Yellowknife\nAntarctica/Casey\nAntarctica/Davis\nAntarctica/DumontDUrville\nAntarctica/Macquarie\nAntarctica/Mawson\nAntarctica/McMurdo\nAntarctica/Palmer\nAntarctica/Rothera\nAntarctica/South_Pole\nAntarctica/Syowa\nAntarctica/Vostok\nArctic/Longyearbyen\nAsia/Aden\nAsia/Almaty\nAsia/Amman\nAsia/Anadyr\nAsia/Aqtau\nAsia/Aqtobe\nAsia/Ashgabat\nAsia/Ashkhabad\nAsia/Baghdad\nAsia/Bahrain\nAsia/Baku\nAsia/Bangkok\nAsia/Beirut\nAsia/Bishkek\nAsia/Brunei\nAsia/Calcutta\nAsia/Choibalsan\nAsia/Chongqing\nAsia/Chungking\nAsia/Colombo\nAsia/Dacca\nAsia/Damascus\nAsia/Dhaka\nAsia/Dili\nAsia/Dubai\nAsia/Dushanbe\nAsia/Gaza\nAsia/Harbin\nAsia/Hebron\nAsia/Ho_Chi_Minh\nAsia/Hong_Kong\nAsia/Hovd\nAsia/Irkutsk\nAsia/Istanbul\nAsia/Jakarta\nAsia/Jayapura\nAsia/Jerusalem\nAsia/Kabul\nAsia/Kamchatka\nAsia/Karachi\nAsia/Kashgar\nAsia/Kathmandu\nAsia/Katmandu\nAsia/Kolkata\nAsia/Krasnoyarsk\nAsia/Kuala_Lumpur\nAsia/Kuching\nAsia/Kuwait\nAsia/Macao\nAsia/Macau\nAsia/Magadan\nAsia/Makassar\nAsia/Manila\nAsia/Muscat\nAsia/Nicosia\nAsia/Novokuznetsk\nAsia/Novosibirsk\nAsia/Omsk\nAsia/Or
al\nAsia/Phnom_Penh\nAsia/Pontianak\nAsia/Pyongyang\nAsia/Qatar\nAsia/Qyzylorda\nAsia/Rangoon\nAsia/Riyadh\nAsia/Saigon\nAsia/Sakhalin\nAsia/Samarkand\nAsia/Seoul\nAsia/Shanghai\nAsia/Singapore\nAsia/Taipei\nAsia/Tashkent\nAsia/Tbilisi\nAsia/Tehran\nAsia/Tel_Aviv\nAsia/Thimbu\nAsia/Thimphu\nAsia/Tokyo\nAsia/Ujung_Pandang\nAsia/Ulaanbaatar\nAsia/Ulan_Bator\nAsia/Urumqi\nAsia/Vientiane\nAsia/Vladivostok\nAsia/Yakutsk\nAsia/Yekaterinburg\nAsia/Yerevan\nAtlantic/Azores\nAtlantic/Bermuda\nAtlantic/Canary\nAtlantic/Cape_Verde\nAtlantic/Faeroe\nAtlantic/Faroe\nAtlantic/Jan_Mayen\nAtlantic/Madeira\nAtlantic/Reykjavik\nAtlantic/South_Georgia\nAtlantic/St_Helena\nAtlantic/Stanley\nAustralia/ACT\nAustralia/Adelaide\nAustralia/Brisbane\nAustralia/Broken_Hill\nAustralia/Canberra\nAustralia/Currie\nAustralia/Darwin\nAustralia/Eucla\nAustralia/Hobart\nAustralia/LHI\nAustralia/Lindeman\nAustralia/Lord_Howe\nAustralia/Melbourne\nAustralia/NSW\nAustralia/North\nAustralia/Perth\nAustralia/Queensland\nAustralia/South\nAustralia/Sydney\nAustralia/Tasmania\nAustralia/Victoria\nAustralia/West\nAustralia/Yancowinna\nBrazil/Acre\nBrazil/DeNoronha\nBrazil/East\nBrazil/West\nCET\nCST6CDT\nCanada/Atlantic\nCanada/Central\nCanada/East-Saskatchewan\nCanada/Eastern\nCanada/Mountain\nCanada/Newfoundland\nCanada/Pacific\nCanada/Saskatchewan\nCanada/Yukon\nChile/Continental\nChile/EasterIsland\nCuba\nEET\nEST\nEST5EDT\nEgypt\nEire\nEtc/GMT\nEtc/GMT+0\nEtc/GMT+1\nEtc/GMT+10\nEtc/GMT+11\nEtc/GMT+12\nEtc/GMT+2\nEtc/GMT+3\nEtc/GMT+4\nEtc/GMT+5\nEtc/GMT+6\nEtc/GMT+7\nEtc/GMT+8\nEtc/GMT+9\nEtc/GMT-0\nEtc/GMT-1\nEtc/GMT-10\nEtc/GMT-11\nEtc/GMT-12\nEtc/GMT-13\nEtc/GMT-14\nEtc/GMT-2\nEtc/GMT-3\nEtc/GMT-4\nEtc/GMT-5\nEtc/GMT-6\nEtc/GMT-7\nEtc/GMT-8\nEtc/GMT-9\nEtc/GMT0\nEtc/Greenwich\nEtc/UCT\nEtc/UTC\nEtc/Universal\nEtc/Zulu\nEurope/Amsterdam\nEurope/Andorra\nEurope/Athens\nEurope/Belfast\nEurope/Belgrade\nEurope/Berlin\nEurope/Bratislava\nEurope/Brussels\nEurope/Bucharest\nEurope/Budapest\nEurope/Chisinau\n
Europe/Copenhagen\nEurope/Dublin\nEurope/Gibraltar\nEurope/Guernsey\nEurope/Helsinki\nEurope/Isle_of_Man\nEurope/Istanbul\nEurope/Jersey\nEurope/Kaliningrad\nEurope/Kiev\nEurope/Lisbon\nEurope/Ljubljana\nEurope/London\nEurope/Luxembourg\nEurope/Madrid\nEurope/Malta\nEurope/Mariehamn\nEurope/Minsk\nEurope/Monaco\nEurope/Moscow\nEurope/Nicosia\nEurope/Oslo\nEurope/Paris\nEurope/Podgorica\nEurope/Prague\nEurope/Riga\nEurope/Rome\nEurope/Samara\nEurope/San_Marino\nEurope/Sarajevo\nEurope/Simferopol\nEurope/Skopje\nEurope/Sofia\nEurope/Stockholm\nEurope/Tallinn\nEurope/Tirane\nEurope/Tiraspol\nEurope/Uzhgorod\nEurope/Vaduz\nEurope/Vatican\nEurope/Vienna\nEurope/Vilnius\nEurope/Volgograd\nEurope/Warsaw\nEurope/Zagreb\nEurope/Zaporozhye\nEurope/Zurich\nGB\nGB-Eire\nGMT\nGMT+0\nGMT-0\nGMT0\nGreenwich\nHST\nHongkong\nIceland\nIndian/Antananarivo\nIndian/Chagos\nIndian/Christmas\nIndian/Cocos\nIndian/Comoro\nIndian/Kerguelen\nIndian/Mahe\nIndian/Maldives\nIndian/Mauritius\nIndian/Mayotte\nIndian/Reunion\nIran\nIsrael\nJamaica\nJapan\nKwajalein\nLibya\nMET\nMST\nMST7MDT\nMexico/BajaNorte\nMexico/BajaSur\nMexico/General\nNZ\nNZ-CHAT\nNavajo\nPRC\nPST8PDT\nPacific/Apia\nPacific/Auckland\nPacific/Chatham\nPacific/Chuuk\nPacific/Easter\nPacific/Efate\nPacific/Enderbury\nPacific/Fakaofo\nPacific/Fiji\nPacific/Funafuti\nPacific/Galapagos\nPacific/Gambier\nPacific/Guadalcanal\nPacific/Guam\nPacific/Honolulu\nPacific/Johnston\nPacific/Kiritimati\nPacific/Kosrae\nPacific/Kwajalein\nPacific/Majuro\nPacific/Marquesas\nPacific/Midway\nPacific/Nauru\nPacific/Niue\nPacific/Norfolk\nPacific/Noumea\nPacific/Pago_Pago\nPacific/Palau\nPacific/Pitcairn\nPacific/Pohnpei\nPacific/Ponape\nPacific/Port_Moresby\nPacific/Rarotonga\nPacific/Saipan\nPacific/Samoa\nPacific/Tahiti\nPacific/Tarawa\nPacific/Tongatapu\nPacific/Truk\nPacific/Wake\nPacific/Wallis\nPacific/Yap\nPoland\nPortugal\nROC\nROK\nSingapore\nTurkey\nUCT\nUS/Alaska\nUS/Aleutian\nUS/Arizona\nUS/Central\nUS/East-Indiana\nUS/Eastern\nUS/Haw
aii\nUS/Indiana-Starke\nUS/Michigan\nUS/Mountain\nUS/Pacific\nUS/Pacific-New\nUS/Samoa\nUTC\nUniversal\nW-SU\nWET\nZulu\n\n", "output": "You can list all the available timezones with pytz.all_timezones:\nIn [40]: import pytz\nIn [41]: pytz.all_timezones\nOut[42]: \n['Africa/Abidjan',\n 'Africa/Accra',\n 'Africa/Addis_Ababa',\n ...]\n\nThere is also pytz.common_timezones:\nIn [45]: len(pytz.common_timezones)\nOut[45]: 403\n\nIn [46]: len(pytz.all_timezones)\nOut[46]: 563\n\n"} {"input": "Python Time Seconds to h:m:s I have a function that returns information in seconds, but I need to store that information in hours:minutes:seconds. Is there an easy way to convert the seconds to this format in python?\n", "output": "or you can do\n>>> import datetime\n>>> str(datetime.timedelta(seconds=666))\n'0:11:06'\n\n"} {"input": "How to manage local vs production settings in Django? \nWhat is the recommended way of handling settings for local development and the production server? Some of them (like constants, etc) can be changed/accessed in both, but some of them (like paths to static files) need to remain different, and hence should not be overwritten every time the new code is deployed.\nCurrently, I am adding all constants to settings.py. But every time I change some constant locally, I have to copy it to the production server and edit the file for production specific changes... 
:( \nEdit: looks like there is no standard answer to this question, I've accepted the most popular method.\n", "output": "Two Scoops of Django: Best Practices for Django 1.5 suggests using version control for your settings files and storing the files in a separate directory:\nproject/\n app1/\n app2/\n project/\n __init__.py\n settings/\n __init__.py\n base.py\n local.py\n production.py\n manage.py\n\nThe base.py file contains common settings (such as MEDIA_ROOT or ADMIN), while local.py and production.py have site-specific settings:\nIn the base file settings/base.py:\nINSTALLED_APPS = (\n # common apps...\n)\n\nIn the local development settings file settings/local.py:\nfrom project.settings.base import *\n\nDEBUG = True\nINSTALLED_APPS += (\n 'debug_toolbar', # and other apps for local development\n)\n\nIn the production settings file settings/production.py:\nfrom project.settings.base import *\n\nDEBUG = False\nINSTALLED_APPS += (\n # other apps for production site\n)\n\nThen when you run django, you add the --settings option:\n# Running django for local development\n$ ./manage.py runserver 0:8000 --settings=project.settings.local\n\n# Running django shell on the production site\n$ ./manage.py shell --settings=project.settings.production\n\nThe authors of the book have also put up a sample project layout template on Github.\n"} {"input": "Relative imports in Python 3 I want to import a function from another file in the same directory.\nSometimes it works for me with from .mymodule import myfunction but sometimes I get a \n\nSystemError: Parent module '' not loaded, cannot perform relative import\n\nSometimes it works with from mymodule import myfunction, but sometimes I also get a \n\nSystemError: Parent module '' not loaded, cannot perform relative import\n\nI don't understand the logic here, and I couldn't find any explanation. 
This looks completely random.\nCould someone explain to me what's the logic behind all this?\n", "output": "\nunfortunately, this module needs to be inside the package, and it also\n needs to be runnable as a script, sometimes. Any idea how I could\n achieve that?\n\nIt's quite common to have a layout like this...\nmain.py\nmypackage/\n __init__.py\n mymodule.py\n myothermodule.py\n\n...with a mymodule.py like this...\n#!/usr/bin/env python3\n\n# Exported function\ndef as_int(a):\n return int(a)\n\n# Test function for module \ndef _test():\n assert as_int('1') == 1\n\nif __name__ == '__main__':\n _test()\n\n...a myothermodule.py like this...\n#!/usr/bin/env python3\n\nfrom .mymodule import as_int\n\n# Exported function\ndef add(a, b):\n return as_int(a) + as_int(b)\n\n# Test function for module \ndef _test():\n assert add('1', '1') == 2\n\nif __name__ == '__main__':\n _test()\n\n...and a main.py like this...\n#!/usr/bin/env python3\n\nfrom mypackage.myothermodule import add\n\ndef main():\n print(add('1', '1'))\n\nif __name__ == '__main__':\n main()\n\n...which works fine when you run main.py or mypackage/mymodule.py, but fails with mypackage/myothermodule.py, due to the relative import...\nfrom .mymodule import as_int\n\nThe way you're supposed to run it is...\npython3 -m mypackage.myothermodule\n\n...but it's somewhat verbose, and doesn't mix well with a shebang line like #!/usr/bin/env python3.\nThe simplest fix for this case, assuming the name mymodule is globally unique, would be to avoid using relative imports, and just use...\nfrom mymodule import as_int\n\n...although, if it's not unique, or your package structure is more complex, you'll need to include the directory containing your package directory in PYTHONPATH, and do it like this...\nfrom mypackage.mymodule import as_int\n\n...or if you want it to work \"out of the box\", you can frob the PYTHONPATH in code first with this...\nimport sys\nimport os\n\nPACKAGE_PARENT = '..'\nSCRIPT_DIR = 
os.path.dirname(os.path.realpath(os.path.join(os.getcwd(), os.path.expanduser(__file__))))\nsys.path.append(os.path.normpath(os.path.join(SCRIPT_DIR, PACKAGE_PARENT)))\n\nfrom mypackage.mymodule import as_int\n\nIt's kind of a pain, but there's a clue as to why in an email written by a certain Guido van Rossum...\n\nI'm -1 on this and on any other proposed twiddlings of the __main__\n machinery. The only use case seems to be running scripts that happen\n to be living inside a module's directory, which I've always seen as an\n antipattern. To make me change my mind you'd have to convince me that\n it isn't.\n\nWhether running scripts inside a package is an antipattern or not is subjective, but personally I find it really useful in a package I have which contains some custom wxPython widgets, so I can run the script for any of the source files to display a wx.Frame containing only that widget for testing purposes.\n"} {"input": "How can I improve my paw detection? After my previous question on finding toes within each paw, I started loading up other measurements to see how it would hold up. Unfortunately, I quickly ran into a problem with one of the preceding steps: recognizing the paws.\nYou see, my proof of concept basically took the maximal pressure of each sensor over time and would start looking for the sum of each row, until it finds on that != 0.0. Then it does the same for the columns and as soon as it finds more than 2 rows with that are zero again. It stores the minimal and maximal row and column values to some index.\n\nAs you can see in the figure, this works quite well in most cases. However, there are a lot of downsides to this approach (other than being very primitive):\n\nHumans can have 'hollow feet' which means there are several empty rows within the footprint itself. Since I feared this could happen with (large) dogs too, I waited for at least 2 or 3 empty rows before cutting off the paw. 
\nThis creates a problem if another contact made in a different column before it reaches several empty rows, thus expanding the area. I figure I could compare the columns and see if they exceed a certain value, they must be separate paws.\nThe problem gets worse when the dog is very small or walks at a higher pace. What happens is that the front paw's toes are still making contact, while the hind paw's toes just start to make contact within the same area as the front paw!\nWith my simple script, it won't be able to split these two, because it would have to determine which frames of that area belong to which paw, while currently I would only have to look at the maximal values over all frames.\n\nExamples of where it starts going wrong:\n\n\nSo now I'm looking for a better way of recognizing and separating the paws (after which I'll get to the problem of deciding which paw it is!).\nUpdate:\nI've been tinkering to get Joe's (awesome!) answer implemented, but I'm having difficulties extracting the actual paw data from my files.\n\nThe coded_paws shows me all the different paws, when applied to the maximal pressure image (see above). However, the solution goes over each frame (to separate overlapping paws) and sets the four Rectangle attributes, such as coordinates or height/width. \nI can't figure out how to take these attributes and store them in some variable that I can apply to the measurement data. Since I need to know for each paw, what its location is during which frames and couple this to which paw it is (front/hind, left/right).\nSo how can I use the Rectangles attributes to extract these values for each paw?\nI have the measurements I used in the question setup in my public Dropbox folder (example 1, example 2, example 3). For anyone interested I also set up a blog to keep you up to date :-)\n", "output": "If you're just wanting (semi) contiguous regions, there's already an easy implementation in Python: SciPy's ndimage.morphology module. 
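As a tiny, self-contained illustration of the labeling idea the answer below relies on (the array shape, blob positions, and threshold here are invented purely for the demo; this is not the paw data):

```python
import numpy as np
from scipy import ndimage

# Synthetic "pressure" frame containing two separate blobs
frame = np.zeros((8, 8))
frame[1:3, 1:3] = 1.0   # first blob
frame[5:7, 4:7] = 1.0   # second blob

# Label each contiguous region, then get a bounding slice per region
labeled, num_regions = ndimage.label(frame > 0.5)
slices = ndimage.find_objects(labeled)

print(num_regions)              # 2
print(frame[slices[0]].shape)   # (2, 2) -- extent of the first blob
```

ndimage.label assigns a distinct integer to each connected region, and ndimage.find_objects returns one tuple of slices per region; that pair of calls is the core of the approach.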
This is a fairly common image morphology operation. \n\nBasically, you have 5 steps:\ndef find_paws(data, smooth_radius=5, threshold=0.0001):\n data = sp.ndimage.uniform_filter(data, smooth_radius)\n thresh = data > threshold\n filled = sp.ndimage.morphology.binary_fill_holes(thresh)\n coded_paws, num_paws = sp.ndimage.label(filled)\n data_slices = sp.ndimage.find_objects(coded_paws)\n return data_slices\n\n\nBlur the input data a bit to make sure the paws have a continuous footprint. (It would be more efficient to just use a larger kernel (the structure kwarg to the various scipy.ndimage.morphology functions) but this isn't quite working properly for some reason...) \nThreshold the array so that you have a boolean array of places where the pressure is over some threshold value (i.e. thresh = data > value)\nFill any internal holes, so that you have cleaner regions (filled = sp.ndimage.morphology.binary_fill_holes(thresh))\nFind the separate contiguous regions (coded_paws, num_paws = sp.ndimage.label(filled)). This returns an array with the regions coded by number (each region is a contiguous area of a unique integer (1 up to the number of paws), with zeros everywhere else).\nIsolate the contiguous regions using data_slices = sp.ndimage.find_objects(coded_paws). This returns a list of tuples of slice objects, so you could get the region of the data for each paw with [data[x] for x in data_slices]. Instead, we'll draw a rectangle based on these slices, which takes slightly more work.\n\n\nThe two animations below show your \"Overlapping Paws\" and \"Grouped Paws\" example data. This method seems to be working perfectly. (And for whatever it's worth, this runs much more smoothly than the GIF images below on my machine, so the paw detection algorithm is fairly fast...)\n\n\n\nHere's a full example (now with much more detailed explanations). The vast majority of this is reading the input and making an animation. 
The actual paw detection is only 5 lines of code.\nimport numpy as np\nimport scipy as sp\nimport scipy.ndimage\n\nimport matplotlib.pyplot as plt\nfrom matplotlib.patches import Rectangle\n\ndef animate(input_filename):\n \"\"\"Detects paws and animates the position and raw data of each frame\n in the input file\"\"\"\n # With matplotlib, it's much, much faster to just update the properties\n # of a display object than it is to create a new one, so we'll just update\n # the data and position of the same objects throughout this animation...\n\n infile = paw_file(input_filename)\n\n # Since we're making an animation with matplotlib, we need \n # ion() instead of show()...\n plt.ion()\n fig = plt.figure()\n ax = fig.add_subplot(111)\n fig.suptitle(input_filename)\n\n # Make an image based on the first frame that we'll update later\n # (The first frame is never actually displayed)\n im = ax.imshow(infile.next()[1])\n\n # Make 4 rectangles that we can later move to the position of each paw\n rects = [Rectangle((0,0), 1,1, fc='none', ec='red') for i in range(4)]\n [ax.add_patch(rect) for rect in rects]\n\n title = ax.set_title('Time 0.0 ms')\n\n # Process and display each frame\n for time, frame in infile:\n paw_slices = find_paws(frame)\n\n # Hide any rectangles that might be visible\n [rect.set_visible(False) for rect in rects]\n\n # Set the position and size of a rectangle for each paw and display it\n for slice, rect in zip(paw_slices, rects):\n dy, dx = slice\n rect.set_xy((dx.start, dy.start))\n rect.set_width(dx.stop - dx.start + 1)\n rect.set_height(dy.stop - dy.start + 1)\n rect.set_visible(True)\n\n # Update the image data and title of the plot\n title.set_text('Time %0.2f ms' % time)\n im.set_data(frame)\n im.set_clim([frame.min(), frame.max()])\n fig.canvas.draw()\n\ndef find_paws(data, smooth_radius=5, threshold=0.0001):\n \"\"\"Detects and isolates contiguous regions in the input array\"\"\"\n # Blur the input data a bit so the paws have a continous 
footprint \n data = sp.ndimage.uniform_filter(data, smooth_radius)\n # Threshold the blurred data (this needs to be a bit > 0 due to the blur)\n thresh = data > threshold\n # Fill any interior holes in the paws to get cleaner regions...\n filled = sp.ndimage.morphology.binary_fill_holes(thresh)\n # Label each contiguous paw\n coded_paws, num_paws = sp.ndimage.label(filled)\n # Isolate the extent of each paw\n data_slices = sp.ndimage.find_objects(coded_paws)\n return data_slices\n\ndef paw_file(filename):\n \"\"\"Returns a iterator that yields the time and data in each frame\n The infile is an ascii file of timesteps formatted similar to this:\n\n Frame 0 (0.00 ms)\n 0.0 0.0 0.0\n 0.0 0.0 0.0\n\n Frame 1 (0.53 ms)\n 0.0 0.0 0.0\n 0.0 0.0 0.0\n ...\n \"\"\"\n with open(filename) as infile:\n while True:\n try:\n time, data = read_frame(infile)\n yield time, data\n except StopIteration:\n break\n\ndef read_frame(infile):\n \"\"\"Reads a frame from the infile.\"\"\"\n frame_header = infile.next().strip().split()\n time = float(frame_header[-2][1:])\n data = []\n while True:\n line = infile.next().strip().split()\n if line == []:\n break\n data.append(line)\n return time, np.array(data, dtype=np.float)\n\nif __name__ == '__main__':\n animate('Overlapping paws.bin')\n animate('Grouped up paws.bin')\n animate('Normal measurement.bin')\n\n\nUpdate: As far as identifying which paw is in contact with the sensor at what times, the simplest solution is to just do the same analysis, but use all of the data at once. (i.e. stack the input into a 3D array, and work with it, instead of the individual time frames.) 
Because SciPy's ndimage functions are meant to work with n-dimensional arrays, we don't have to modify the original paw-finding function at all.\n# This uses functions (and imports) in the previous code example!!\ndef paw_regions(infile):\n # Read in and stack all data together into a 3D array\n data, time = [], []\n for t, frame in paw_file(infile):\n time.append(t)\n data.append(frame)\n data = np.dstack(data)\n time = np.asarray(time)\n\n # Find and label the paw impacts\n data_slices, coded_paws = find_paws(data, smooth_radius=4)\n\n # Sort by time of initial paw impact... This way we can determine which\n # paws are which relative to the first paw with a simple modulo 4.\n # (Assuming a 4-legged dog, where all 4 paws contacted the sensor)\n data_slices.sort(key=lambda dat_slice: dat_slice[2].start)\n\n # Plot up a simple analysis\n fig = plt.figure()\n ax1 = fig.add_subplot(2,1,1)\n annotate_paw_prints(time, data, data_slices, ax=ax1)\n ax2 = fig.add_subplot(2,1,2)\n plot_paw_impacts(time, data_slices, ax=ax2)\n fig.suptitle(infile)\n\ndef plot_paw_impacts(time, data_slices, ax=None):\n if ax is None:\n ax = plt.gca()\n\n # Group impacts by paw...\n for i, dat_slice in enumerate(data_slices):\n dx, dy, dt = dat_slice\n paw = i%4 + 1\n # Draw a bar over the time interval where each paw is in contact\n ax.barh(bottom=paw, width=time[dt].ptp(), height=0.2, \n left=time[dt].min(), align='center', color='red')\n ax.set_yticks(range(1, 5))\n ax.set_yticklabels(['Paw 1', 'Paw 2', 'Paw 3', 'Paw 4'])\n ax.set_xlabel('Time (ms) Since Beginning of Experiment')\n ax.yaxis.grid(True)\n ax.set_title('Periods of Paw Contact')\n\ndef annotate_paw_prints(time, data, data_slices, ax=None):\n if ax is None:\n ax = plt.gca()\n\n # Display all paw impacts (sum over time)\n ax.imshow(data.sum(axis=2).T)\n\n # Annotate each impact with which paw it is\n # (Relative to the first paw to hit the sensor)\n x, y = [], []\n for i, region in enumerate(data_slices):\n dx, dy, dz = region\n 
# Get x,y center of slice...\n x0 = 0.5 * (dx.start + dx.stop)\n y0 = 0.5 * (dy.start + dy.stop)\n x.append(x0); y.append(y0)\n\n # Annotate the paw impacts \n ax.annotate('Paw %i' % (i%4 +1), (x0, y0), \n color='red', ha='center', va='bottom')\n\n # Plot line connecting paw impacts\n ax.plot(x,y, '-wo')\n ax.axis('image')\n ax.set_title('Order of Steps')\n\n\n\n\n\n\n"} {"input": "which day of week given a date python I want to find out the following:\ngiven a date (datetime object), what is the corresponding day of the week.\nFor instance Sunday is the first day, Monday: second day.. and so on\nAnd then if the input is something like Today's date.\nThe output is maybe 6 (since its Friday)\n", "output": "Use weekday() (docs):\n>>> import datetime\n>>> datetime.datetime.today()\ndatetime.datetime(2012, 3, 23, 23, 24, 55, 173504)\n>>> datetime.datetime.today().weekday()\n4\n\nFrom the documentation:\n\nReturn the day of the week as an integer, where Monday is 0 and Sunday is 6.\n\n"} {"input": "Filter dict to contain only certain keys? I've got a dict that has a whole bunch of entries. I'm only interested in a select few of them. Is there an easy way to prune all the other ones out?\n", "output": "Constructing a new dict: \ndict_you_want = { your_key: old_dict[your_key] for your_key in your_keys }\n\nUses dictionary comprehension. \nIf you use a version which lacks them (ie Python 2.6 and earlier), make it dict((your_key, old_dict[your_key]) for ...). It's the same, though uglier.\nNote that this, unlike jnnnnn's version, has stable performance (depends only on number of your_keys) for old_dicts of any size. Both in terms of speed and memory. 
Since this is a generator expression, it processes one item at a time, and it doesn't look through all items of old_dict.\nRemoving everything in-place:\nunwanted = set(keys) - set(your_dict)\nfor unwanted_key in unwanted: del your_dict[unwanted_key]\n\n"} {"input": "if else in a list comprehension I have a list l:\nl = [22, 13, 45, 50, 98, 69, 43, 44, 1]\n\nFor numbers above 45 inclusive, I would like to add 1; and for numbers less than it, 5.\nI tried\n[x+1 for x in l if x >= 45 else x+5]\n\nBut it gives me a syntax error. How can I achieve an if \u2013 else like this in a list comprehension?\n", "output": ">>> l = [22, 13, 45, 50, 98, 69, 43, 44, 1]\n>>> [x+1 if x >= 45 else x+5 for x in l]\n[27, 18, 46, 51, 99, 70, 48, 49, 6]\n\nDo-something if a condition holds, else do-something-else.\n"} {"input": "matplotlib Legend Markers Only Once I often plot a point on a matplotlib plot with:\nx = 10\ny = 100\nplot(x, y, \"k*\", label=\"Global Optimum\")\nlegend()\n\nHowever, this causes the legend to put a star in the legend twice, such that it looks like:\n* * Global Optimum\n\nwhen I really want it to look like:\n * Global Optimum\n\nHow do I do this?\n", "output": "This should work:\nlegend(numpoints=1)\n\nBTW, if you add the line\nlegend.numpoints : 1 # the number of points in the legend line\n\nto your matplotlibrc file, then this will be the new default.\n[See also scatterpoints, depending on your plot.]\nAPI: Link to API docs\n"} {"input": "Why is Python's \"append\" not \"push\"? Does anyone know why Python's list.append function is not called list.push given that there's already a list.pop that removes and returns the last element (the one indexed at -1), and list.append's semantics are consistent with that use?\n", "output": "Because \"append\" existed long before \"pop\" was thought of. Python 0.9.1 supported list.append in early 1991. By comparison, here's part of a discussion on comp.lang.python about adding pop in 1997. 
Guido wrote:\n\nTo implement a stack, one would need\n to add a list.pop() primitive (and\n no, I'm not against this particular\n one on the basis of any principle). \n list.push() could be added for\n symmetry with list.pop() but I'm not\n a big fan of multiple names for the\n same operation -- sooner or later\n you're going to read code that uses\n the other one, so you need to learn\n both, which is more cognitive load.\n\nYou can also see he discusses the idea of if push/pop/put/pull should be at element [0] or after element [-1] where he posts a reference to Icon's list:\n\nI stil think that all this is best\n left out of the list object \n implementation -- if you need a stack,\n or a queue, with particular \n semantics, write a little class that\n uses a lists\n\nIn other words, for stacks implemented directly as Python lists, which already supports fast append(), and del list[-1], it makes sense that list.pop() work by default on the last element. Even if other languages do it differently.\nImplicit here is that most people need to append to a list, but many fewer have occasion to treat lists as stacks, which is why list.append came in so much earlier.\n"} {"input": "How to get string objects instead of Unicode ones from JSON in Python? I'm using Python 2 to parse JSON from (ASCII encoded) text files. When loading these files with either json or simplejson, all my string values are cast to Unicode objects instead of string objects.\nThe problem is, I have to use the data with some libraries that only accept string objects. 
I can't change the libraries nor update them.\nIs it possible to get string objects instead of Unicode ones from json or simplejson?\nHere's a small example:\n>>> import json\n>>> original_list = ['a', 'b']\n>>> json_list = json.dumps(original_list)\n>>> json_list\n'[\"a\", \"b\"]'\n>>> new_list = json.loads(json_list)\n>>> new_list\n[u'a', u'b'] # I want these to be of type `str`, not `unicode`\n", "output": "While there are some good answers here, I ended up using PyYAML to parse my JSON files, since it gives the keys and values as str type strings instead of unicode type. Because JSON is a subset of YAML it works nicely:\n>>> import json\n>>> import yaml\n>>> list_org = ['a', 'b']\n>>> list_dump = json.dumps(list_org)\n>>> list_dump\n'[\"a\", \"b\"]'\n>>> json.loads(list_dump)\n[u'a', u'b']\n>>> yaml.safe_load(list_dump)\n['a', 'b']\n\nSome things to note though:\n\nI get string objects because all my entries are ASCII encoded. If I were to use unicode-encoded entries, I would get them back as unicode objects - there is no conversion!\nYou should (probably always) use PyYAML's safe_load function; if you use it to load JSON files, you don't need the \"additional power\" of the load function anyway.\nIf you want a YAML parser that has more support for the 1.2 version of the spec (and correctly parses very low numbers) try Ruamel YAML: pip install ruamel.yaml and import ruamel.yaml as yaml was all I needed in my tests.\n\nAs stated, there is no conversion! If you can't be sure to only deal with ASCII values (and you can't be sure most of the time), better use a conversion function:\nI used the one from Mark Amery a couple of times now, it works great and is very easy to use. You can also use a similar function as an object_hook instead, as it might gain you a performance boost on big files. See the slightly more involved answer from Mirec Miskuf for that.\n"} {"input": "Python - How do I pass a string into subprocess.Popen (using the stdin argument)? 
If I do the following:\nimport subprocess\nfrom cStringIO import StringIO\nsubprocess.Popen(['grep','f'],stdout=subprocess.PIPE,stdin=StringIO('one\\ntwo\\nthree\\nfour\\nfive\\nsix\\n')).communicate()[0]\n\nI get:\nTraceback (most recent call last):\n File \"\", line 1, in ?\n File \"/build/toolchain/mac32/python-2.4.3/lib/python2.4/subprocess.py\", line 533, in __init__\n (p2cread, p2cwrite,\n File \"/build/toolchain/mac32/python-2.4.3/lib/python2.4/subprocess.py\", line 830, in _get_handles\n p2cread = stdin.fileno()\nAttributeError: 'cStringIO.StringI' object has no attribute 'fileno'\n\nApparently a cStringIO.StringIO object doesn't quack close enough to a file duck to suit subprocess.Popen. How do I work around this?\n", "output": "Popen.communicate() documentation:\n\nNote that if you want to send data to\n the process's stdin, you need to\n create the Popen object with\n stdin=PIPE. Similarly, to get anything\n other than None in the result tuple,\n you need to give stdout=PIPE and/or\n stderr=PIPE too.\nReplacing os.popen*\n\n pipe = os.popen(cmd, 'w', bufsize)\n # ==>\n pipe = Popen(cmd, shell=True, bufsize=bufsize, stdin=PIPE).stdin\n\n\nWarning Use communicate() rather than\n stdin.write(), stdout.read() or\n stderr.read() to avoid deadlocks due\n to any of the other OS pipe buffers\n filling up and blocking the child\n process.\n\nSo your example could be written as follows:\nfrom subprocess import Popen, PIPE, STDOUT\n\np = Popen(['grep', 'f'], stdout=PIPE, stdin=PIPE, stderr=STDOUT) \ngrep_stdout = p.communicate(input=b'one\\ntwo\\nthree\\nfour\\nfive\\nsix\\n')[0]\nprint(grep_stdout.decode())\n# -> four\n# -> five\n# ->\n\n"} {"input": "How do I find out my python path using python? 
How do I find out which directories are listed in my system's PYTHONPATH variable, from within a Python script (or the interactive shell)?\n", "output": "You would probably also want this:\nimport sys\nprint(sys.path)\n\nOr as a one-liner from the terminal:\npython -c \"import sys; print '\\n'.join(sys.path)\"\n\n"} {"input": "Python dictionary from an object's fields Do you know if there is a built-in function to build a dictionary from an arbitrary object? I'd like to do something like this:\n>>> class Foo:\n... bar = 'hello'\n... baz = 'world'\n...\n>>> f = Foo()\n>>> props(f)\n{ 'bar' : 'hello', 'baz' : 'world' }\n\nNOTE: It should not include methods. Only fields.\nThanks\n", "output": "Note that best practice in current versions of Python is to use new-style classes, i.e.\nclass Foo(object):\n...\n\nAlso, there's a difference between an 'object' and a 'class'. To build a dictionary from an arbitrary object, it's sufficient to use __dict__. Usually, you'll declare your methods at class level and your attributes at instance level, so __dict__ should be fine. For example:\n>>> class A(object):\n... def __init__(self):\n... self.b = 1\n... self.c = 2\n... def do_nothing(self):\n... pass\n...\n>>> a = A()\n>>> a.__dict__\n{'c': 2, 'b': 1}\n\nAlternatively, depending on what you want to do, it might be nice to inherit from dict. Then your class is already a dictionary, and if you want you can override __getattr__ and/or __setattr__ to call through and set the dict. For example:\n class Foo(dict):\n def __init__(self):\n pass\n def __getattr__(self, attr):\n return self[attr]\n\n # etc...\n\n"} {"input": "Determine function name from within that function (without using traceback) In Python, without using the traceback module, is there a way to determine a function's name from within that function?\nSay I have a module foo with a function bar. When executing foo.bar(), is there a way for bar to know bar's name? 
Or better yet, foo.bar's name?\n#foo.py \ndef bar():\n print \"my name is\", __myname__ # <== how do I calculate this at runtime?\n\n", "output": "import inspect\n\ndef foo():\n print inspect.stack()[0][3]\n\n"} {"input": "How can I color Python logging output? Some time ago, I saw a Mono application with colored output, presumably because of its log system (because all the messages were standardized). \nNow, Python has the logging module, which lets you specify a lot of options to customize output. So, I'm imagining something similar would be possible with Python, but I can\u00e2\u0080\u0099t find out how to do this anywhere. \nIs there any way to make the Python logging module output in color? \nWhat I want (for instance) errors in red, debug messages in blue or yellow, and so on. \nOf course this would probably require a compatible terminal (most modern terminals are); but I could fallback to the original logging output if color isn't supported.\nAny ideas how I can get colored output with the logging module?\n", "output": "I already knew about the color escapes, I used them in my bash prompt a while ago. 
Thanks anyway.\nWhat I wanted was to integrate it with the logging module, which I eventually did after a couple of tries and errors.\nHere is what I end up with:\nBLACK, RED, GREEN, YELLOW, BLUE, MAGENTA, CYAN, WHITE = range(8)\n\n#The background is set with 40 plus the number of the color, and the foreground with 30\n\n#These are the sequences need to get colored ouput\nRESET_SEQ = \"\\033[0m\"\nCOLOR_SEQ = \"\\033[1;%dm\"\nBOLD_SEQ = \"\\033[1m\"\n\ndef formatter_message(message, use_color = True):\n if use_color:\n message = message.replace(\"$RESET\", RESET_SEQ).replace(\"$BOLD\", BOLD_SEQ)\n else:\n message = message.replace(\"$RESET\", \"\").replace(\"$BOLD\", \"\")\n return message\n\nCOLORS = {\n 'WARNING': YELLOW,\n 'INFO': WHITE,\n 'DEBUG': BLUE,\n 'CRITICAL': YELLOW,\n 'ERROR': RED\n}\n\nclass ColoredFormatter(logging.Formatter):\n def __init__(self, msg, use_color = True):\n logging.Formatter.__init__(self, msg)\n self.use_color = use_color\n\n def format(self, record):\n levelname = record.levelname\n if self.use_color and levelname in COLORS:\n levelname_color = COLOR_SEQ % (30 + COLORS[levelname]) + levelname + RESET_SEQ\n record.levelname = levelname_color\n return logging.Formatter.format(self, record)\n\nAnd to use it, create your own Logger:\n# Custom logger class with multiple destinations\nclass ColoredLogger(logging.Logger):\n FORMAT = \"[$BOLD%(name)-20s$RESET][%(levelname)-18s] %(message)s ($BOLD%(filename)s$RESET:%(lineno)d)\"\n COLOR_FORMAT = formatter_message(FORMAT, True)\n def __init__(self, name):\n logging.Logger.__init__(self, name, logging.DEBUG) \n\n color_formatter = ColoredFormatter(self.COLOR_FORMAT)\n\n console = logging.StreamHandler()\n console.setFormatter(color_formatter)\n\n self.addHandler(console)\n return\n\n\nlogging.setLoggerClass(ColoredLogger)\n\nJust in case anyone else needs it.\nBe careful if you're using more than one logger or handler: ColoredFormatter is changing the record object, which is passed further to 
other handlers or propagated to other loggers. If you have configured file loggers etc. you probably don't want to have the colors in the log files. To avoid that, it's probably best to simply create a copy of record with copy.copy() before manipulating the levelname attribute, or to reset the levelname to the previous value, before returning the formatted string (credit to Michael in the comments).\n"} {"input": "Split string on whitespace in Python I'm looking for the Python equivalent of \nString str = \"many fancy word \\nhello \\thi\";\nString whiteSpaceRegex = \"\\\\s\";\nString[] words = str.split(whiteSpaceRegex);\n\n[\"many\", \"fancy\", \"word\", \"hello\", \"hi\"]\n\n", "output": "The str.split() method without an argument splits on whitespace:\n>>> \"many fancy word \\nhello \\thi\".split()\n['many', 'fancy', 'word', 'hello', 'hi']\n\n"} {"input": "Get HTML Source of WebElement in Selenium WebDriver using Python I'm using the Python bindings to run Selenium WebDriver.\nfrom selenium import webdriver\nwd = webdriver.Firefox()\n\nI know I can grab a webelement like so...\nelem = wd.find_element_by_css_selector('#my-id')\n\nAnd I know I can get the full page source with...\nwd.page_source\n\nBut is there anyway to get the \"element source\"?\nelem.source # <-- returns the HTML as a string\n\nThe selenium webdriver docs for Python are basically non-existent and I don't see anything in the code that seems to enable that functionality.\nAny thoughts on the best way to access the HTML of an element (and its children)?\n", "output": "You can read innerHTML attribute to get source of the content of the element or outerHTML for source with the current element.\nPython:\nelement.get_attribute('innerHTML')\n\nJava:\nelem.getAttribute(\"innerHTML\");\n\nC#:\nelement.GetAttribute(\"innerHTML\");\n\nRuby:\nelement.attribute(\"innerHTML\")\n\nTested and works with the ChromeDriver.\n"} {"input": "Get current time in milliseconds in Python? 
How can I get the current time in milliseconds in Python? \n", "output": "For what I needed, here's what I did, based on @samplebias' comment above:\nimport time\nmillis = int(round(time.time() * 1000))\nprint millis\n\nQuick'n'easy. Thanks all, sorry for the brain fart.\nFor reuse:\nimport time\n\ncurrent_milli_time = lambda: int(round(time.time() * 1000))\n\nThen:\n>>> current_milli_time()\n1378761833768\n\n"} {"input": "How can I make sense of the `else` statement in Python loops? Many Python programmers are probably unaware that the syntax of while loops and for loops includes an optional else: clause:\nfor val in iterable:\n do_something(val)\nelse:\n clean_up()\n\nThe body of the else clause is a good place for certain kinds of clean-up actions, and is executed on normal termination of the loop: I.e., exiting the loop with return or break skips the else clause; exiting after a continue executes it. I know this only because I just looked it up (yet again), because I can never remember when the else clause is executed.\nAlways? On \"failure\" of the loop, as the name suggests? On regular termination? Even if the loop is exited with return? I can never be entirely sure without looking it up.\nI blame my persisting uncertainty on the choice of keyword: I find else incredibly unmnemonic for this semantics. My question is not \"why is this keyword used for this purpose\" (which I would probably vote to close, though only after reading the answers and comments), but how can I think about the else keyword so that its semantics make sense, and I can therefore remember it?\nI'm sure there was a fair amount of discussion about this, and I can imagine that the choice was made for consistency with the try statement's else: clause (which I also have to look up), and with the goal of not adding to the list of Python's reserved words. 
Perhaps the reasons for choosing else will clarify its function and make it more memorable, but I'm after connecting name to function, not after historical explanation per se.\nThe answers to this question, which my question was briefly closed as a duplicate of, contain a lot of interesting back story. My question has a different focus (how to connect the specific semantics of else with the keyword choice), but I feel there should be a link to this question somewhere.\n", "output": "(This is inspired by @Mark Tolonen's answer.)\nAn if statement runs its else clause if its condition evaluates to false.\nIdentically, a while loop runs the else clause if its condition evaluates to false.\nThis rule matches the behavior you described:\n\nIn normal execution, the while loop repeatedly runs until the condition evaluates to false, and therefore naturally exiting the loop runs the else clause.\nWhen you execute a break statement, you exit out of the loop without evaluating the condition, so the condition cannot evaluate to false and you never run the else clause.\nWhen you execute a continue statement, you evaluate the condition again, and do exactly what you normally would at the beginning of a loop iteration.\nSo, if the condition is true, you keep looping, but if it is false you run the else clause.\nOther methods of exiting the loop, such as return, do not evaluate the condition and therefore do not run the else clause.\n\nfor loops behave the same way. Just consider the condition as true if the iterator has more elements, or false otherwise.\n"} {"input": "Relationship between scipy and numpy scipy appears to provide most (but not all [1]) of numpy's functions in its own namespace. In other words, if there's a function named numpy.foo, there's almost certainly a scipy.foo. Most of the time, the two appear to be exactly the same, oftentimes even pointing to the same function object.\nSometimes, they're different. 
To give an example that came up recently:\n\nnumpy.log10 is a ufunc that returns NaNs for negative arguments;\nscipy.log10 returns complex values for negative arguments, and doesn't appear to be a ufunc.\n\nThe same can be said about log, log2 and logn, but not about log1p [2].\nOn the other hand, numpy.exp and scipy.exp appear to be different names for the same ufunc. This is also true of scipy.log1p and numpy.log1p.\nAnother example is numpy.linalg.solve vs scipy.linalg.solve. They're similar, but the latter offers some additional features over the former.\nWhy the apparent duplication? If this is meant to be a wholesale import of numpy into the scipy namespace, why the subtle differences in behaviour and the missing functions? Is there some overarching logic that would help clear up the confusion?\n[1] numpy.min, numpy.max, numpy.abs and a few others have no counterparts in the scipy namespace.\n[2] Tested using numpy 1.5.1 and scipy 0.9.0rc2.\n", "output": "Last time I checked it, the scipy __init__ method executes a\nfrom numpy import *\n\nso that the whole numpy namespace is included into scipy when the scipy module is imported.\nThe log10 behaviour you are describing is interesting, because both versions are coming from numpy. One is a ufunc, the other is a numpy.lib function. Why scipy is preferring the library function over the ufunc, I don't know off the top of my head.\n\nEDIT: In fact, I can answer the log10 question. Looking in the scipy __init__ method I see this:\n# Import numpy symbols to scipy name space\nimport numpy as _num\nfrom numpy import oldnumeric\nfrom numpy import *\nfrom numpy.random import rand, randn\nfrom numpy.fft import fft, ifft\nfrom numpy.lib.scimath import *\n\nThe log10 function you get in scipy comes from numpy.lib.scimath. 
Looking at that code, it says:\n\"\"\"\nWrapper functions to more user-friendly calling of certain math functions\nwhose output data-type is different than the input data-type in certain\ndomains of the input.\n\nFor example, for functions like log() with branch cuts, the versions in this\nmodule provide the mathematically valid answers in the complex plane:\n\n>>> import math\n>>> from numpy.lib import scimath\n>>> scimath.log(-math.exp(1)) == (1+1j*math.pi)\nTrue\n\nSimilarly, sqrt(), other base logarithms, power() and trig functions are\ncorrectly handled. See their respective docstrings for specific examples.\n\"\"\"\n\nIt seems that module overlays the base numpy ufuncs for sqrt, log, log2, logn, log10, power, arccos, arcsin, and arctanh. That explains the behaviour you are seeing. The underlying design reason why it is done like that is probably buried in a mailing list post somewhere.\n"} {"input": "How to split a string into array of characters with Python? I've tried to look around the web for answers to splitting a string into an array of characters but I can't seem to find a simple method\nstr.split(//) does not seem to work like Ruby does. Is there a simple way of doing this without looping?\n", "output": ">>> s = \"foobar\"\n>>> list(s)\n['f', 'o', 'o', 'b', 'a', 'r']\n\nYou need list\n"} {"input": "Dump a NumPy array into a csv file Is there a way to dump a NumPy array into a CSV file? I have a 2D NumPy array and need to dump it in human-readable format.\n", "output": "numpy.savetxt saves an array to a text file.\nimport numpy\na = numpy.asarray([ [1,2,3], [4,5,6], [7,8,9] ])\nnumpy.savetxt(\"foo.csv\", a, delimiter=\",\")\n\n"} {"input": "Change one character in a string in Python? 
What is the easiest way in Python to replace a character in a string like: \ntext = \"abcdefg\";\ntext[1] = \"Z\";\n ^\n\n", "output": "Don't modify strings.\nWork with them as lists; turn them into strings only when needed.\n>>> s = list(\"Hello zorld\")\n>>> s\n['H', 'e', 'l', 'l', 'o', ' ', 'z', 'o', 'r', 'l', 'd']\n>>> s[6] = 'W'\n>>> s\n['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd']\n>>> \"\".join(s)\n'Hello World'\n\nPython strings are immutable (i.e. they can't be modified). There are a lot of reasons for this. Use lists until you have no choice, only then turn them into strings.\n"} {"input": "Iterating through a range of dates in Python I have the following code to do this, but how can I do it better? Right now I think it's better than nested loops, but it starts to get Perl-one-linerish when you have a generator in a list comprehension. \nday_count = (end_date - start_date).days + 1\nfor single_date in [d for d in (start_date + timedelta(n) for n in range(day_count)) if d <= end_date]:\n print strftime(\"%Y-%m-%d\", single_date.timetuple())\n\nNotes\n\nI'm not actually using this to print. That's just for demo purposes. \nThe start_date and end_date variables are datetime.date objects because I don't need the timestamps. (They're going to be used to generate a report).\n\nSample Output\nFor a start date of 2009-05-30 and an end date of 2009-06-09:\n2009-05-30\n2009-05-31\n2009-06-01\n2009-06-02\n2009-06-03\n2009-06-04\n2009-06-05\n2009-06-06\n2009-06-07\n2009-06-08\n2009-06-09\n\n", "output": "Why are there two nested iterations? For me it produces the same list of data with only one iteration:\nfor single_date in (start_date + timedelta(n) for n in range(day_count)):\n print ...\n\nAnd no list gets stored, only one generator is iterated over. 
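Spelled out as a self-contained sketch, using the sample dates from the question:

```python
from datetime import date, timedelta

start_date = date(2009, 5, 30)
end_date = date(2009, 6, 9)
day_count = (end_date - start_date).days + 1  # +1 makes the range inclusive

# one generator expression, no intermediate list
for single_date in (start_date + timedelta(n) for n in range(day_count)):
    print(single_date.strftime("%Y-%m-%d"))
```

This prints the eleven dates from 2009-05-30 through 2009-06-09, one per line.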
Also the \"if\" in the generator seems to be unnecessary.\nAfter all, a linear sequence should only require one iterator, not two.\nUpdate after discussion with John Machin:\nMaybe the most elegant solution is using a generator function to completely hide/abstract the iteration over the range of dates:\nfrom datetime import timedelta, date\n\ndef daterange(start_date, end_date):\n for n in range(int ((end_date - start_date).days)):\n yield start_date + timedelta(n)\n\nstart_date = date(2013, 1, 1)\nend_date = date(2015, 6, 2)\nfor single_date in daterange(start_date, end_date):\n print single_date.strftime(\"%Y-%m-%d\")\n\nNB: For consistency with the built-in range() function this iteration stops before reaching the end_date. So for inclusive iteration use the next day, as you would with range().\n"} {"input": "How to set environment variables in Python I need to set some environment variables in the python script and I want all the other scripts that are called from python (shell scripts) which will be child process to see the environment variables set. The value is a number.\nIf I do os.environ[\"DEBUSSY\"] = 1, it complains saying that 1 has to be string.\nI also want to know how to read the environment variables in python (in the later part of the script) once I set it. \n", "output": "Environment variables must be strings, so use\nos.environ[\"DEBUSSY\"] = \"1\"\n\nto set the variable DEBUSSY to the string 1. To access this variable later, simply use\nprint os.environ[\"DEBUSSY\"]\n\nChild processes automatically inherit the environment of the parent process -- no special action on your part is required.\n"} {"input": "Python String and Integer concatenation I want to create string using integer appended to it, in a for loop. 
Like this:\nfor i in range [1,10]:\n string=\"string\"+i\n\nBut it returns an error:\nTypeError: unsupported operand type(s) for +: 'int' and 'str'\n\nWhat's the best way to concatenate the String and Integer?\n", "output": "for i in range (1,10):\n string=\"string\"+str(i)\n\nTo get string0, string1 ..... string10, you could do like\n>>> [\"string\"+str(i) for i in range(11)]\n['string0', 'string1', 'string2', 'string3', 'string4', 'string5', 'string6', 'string7', 'string8', 'string9', 'string10']\n\n"} {"input": "How do I update pip itself from inside my virtual environment? I'm able to update pip-managed packages, but how do I update pip itself? According to pip --version, I currently have pip 1.1 installed in my virtualenv and I want to update to the latest version. \nWhat's the command for that? Do I need to use distribute or is there a native pip or virtualenv command? I've already tried pip update and pip update pip with no success.\n", "output": "pip is just a PyPI package like any other; you could use it to upgrade itself the same way you would upgrade any package:\npip install --upgrade pip\n\nOn Windows the recommended command is:\npython -m pip install --upgrade pip\n\n"} {"input": "How can I create a directly-executable cross-platform GUI app using Python? 
Python works on multiple platforms and can be used for desktop and web applications, thus I conclude that there is some way to compile it into an executable for Mac, Windows and Linux.\nThe problem is that I have no idea where to start or how to write a GUI with it. Can anybody shed some light on this and point me in the right direction, please?", "output": "First you will need some GUI library with Python bindings and then (if you want) some program that will convert your python scripts into standalone executables.\nCross-platform GUI libraries with Python bindings (Windows, Linux, Mac)\nOf course, there are many, but the most popular that I've seen in the wild are:\n\nTkinter - based on Tk GUI toolkit (de-facto standard GUI library for python, free for commercial projects)\nWxPython - based on WxWidgets (very popular, free for commercial projects)\nPyQt - based on Qt (also very popular and more stable than WxWidgets but costly license for commercial projects)\n\nComplete list is at http://wiki.python.org/moin/GuiProgramming\nSingle executable (Windows)\n\npy2exe - Probably the most popular out there (PyInstaller is also gaining in popularity)\n\nSingle executable (Linux)\n\nFreeze - works the same way as py2exe but targets the Linux platform\n\nSingle executable (Mac)\n\npy2app - again, works like py2exe but targets Mac OS\n\n"} {"input": "Delete Folder Contents in Python How can I delete the contents of a local folder in Python?\nThe current project is for Windows but I would like to see *nix also.\n", "output": "Updated to only delete files and to use the os.path.join() method suggested in the comments. 
If you also want to remove subdirectories, uncomment the elif statement.\nimport os, shutil\nfolder = '/path/to/folder'\nfor the_file in os.listdir(folder):\n file_path = os.path.join(folder, the_file)\n try:\n if os.path.isfile(file_path):\n os.unlink(file_path)\n #elif os.path.isdir(file_path): shutil.rmtree(file_path)\n except Exception as e:\n print(e)\n\n"} {"input": "Get list from pandas DataFrame column headers I want to get a list of the column headers from a pandas DataFrame. The DataFrame will come from user input so I won't know how many columns there will be or what they will be called.\nFor example, if I'm given a DataFrame like this:\n>>> my_dataframe\n y gdp cap\n0 1 2 5\n1 2 3 9\n2 8 7 2\n3 3 4 7\n4 6 7 7\n5 4 8 3\n6 8 2 8\n7 9 9 10\n8 6 6 4\n9 10 10 7\n\nI would want to get a list like this:\n>>> header_list\n[y, gdp, cap]\n\n", "output": "You can get the values as a list by doing:\nlist(my_dataframe.columns.values)\n\n"} {"input": "SSL InsecurePlatform error when using Requests package I'm using Python 2.7.3 and Requests. I installed Requests via pip. I believe it's the latest version. I'm running on Debian Wheezy.\nI've used Requests lots of times in the past and never faced this issue, but it seems that when making https requests with Requests I get an InsecurePlatform exception.\nThe error mentions urllib3, but I don't have that installed. I did install it to check if it resolved the error, but it didn't.\n/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3\n/util/ssl_.py:79: InsecurePlatformWarning: A true SSLContext object is not\navailable. This prevents urllib3 from configuring SSL appropriately and \nmay cause certain SSL connections to fail. For more information, see \nhttps://urllib3.readthedocs.org/en/latest \n/security.html#insecureplatformwarning.\n\nAny ideas as to why I'm getting this? 
I've checked the docs, as specified in the error message, but the docs are saying to import urllib3 and either disable the warning, or provide a certificate.\n", "output": "Use the somewhat hidden security feature:\npip install 'requests[security]'\nor\npip install pyOpenSSL ndg-httpsclient pyasn1 \nBoth commands install following extra packages:\n\npyOpenSSL\nndg-httpsclient\npyasn1\n\nPlease note that this is not required for python-2.7.9+.\nIf pip install fails with errors, check whether you have required development packages for libffi, libssl and python installed in your system using distribution's package manager:\n\nDebian/Ubuntu - python-dev libffi-dev libssl-dev packages.\nFedora - openssl-devel python-devel libffi-devel packages.\n\nDistro list above is incomplete.\nWorkaround (see the original answer by @TomDotTom):\nIn case you cannot install some of the required development packages, there's also an option to disable that warning:\nimport requests.packages.urllib3\nrequests.packages.urllib3.disable_warnings()\n\n"} {"input": "Read .mat files in Python Does anyone have successful experience reading binary Matlab .mat files in Python?\n(I've seen that scipy has alleged support for reading .mat files, but I'm unsuccessful with it. I installed scipy version 0.7.0, and I can't find the loadmat() method)\n", "output": "Silly me. Forgot to import io...\nimport scipy.io\nmat = scipy.io.loadmat('file.mat')\n\n"} {"input": "How can I get a list of all classes within current module in Python? 
I've seen plenty of examples of people extracting all of the classes from a module, usually something like:\n# foo.py\nclass Foo:\n pass\n\n# test.py\nimport inspect\nimport foo\n\nfor name, obj in inspect.getmembers(foo):\n if inspect.isclass(obj):\n print obj\n\nAwesome.\nBut I can't find out how to get all of the classes from the current module.\n# foo.py\nimport inspect\n\nclass Foo:\n pass\n\ndef print_classes():\n for name, obj in inspect.getmembers(???): # what do I do here?\n if inspect.isclass(obj):\n print obj\n\n# test.py\nimport foo\n\nfoo.print_classes()\n\nThis is probably something really obvious, but I haven't been able to find anything. Can anyone help me out?\n", "output": "Try this:\nimport sys\ncurrent_module = sys.modules[__name__]\n\nIn your context:\nimport sys, inspect\ndef print_classes():\n for name, obj in inspect.getmembers(sys.modules[__name__]):\n if inspect.isclass(obj):\n print(obj)\n\nAnd even better:\nclsmembers = inspect.getmembers(sys.modules[__name__], inspect.isclass)\n\nBecause inspect.getmembers() takes a predicate.\n"} {"input": "Python try...except comma vs 'as' in except What is the difference between ',' and 'as' in except statements, eg:\ntry:\n pass\nexcept Exception, exception:\n pass\n\nand:\ntry:\n pass\nexcept Exception as exception:\n pass\n\nIs the second syntax legal in 2.6? It works in CPython 2.6 on Windows but the 2.5 interpreter in cygwin complains that it is invalid.\nIf they are both valid in 2.6 which should I use?\n", "output": "The definitive document is PEP-3110: Catching Exceptions\nSummary:\n\nIn Python 3.x, using as is required.\nIn Python 2.6+, use the as syntax, since it is far less ambiguous and forward compatible with Python 3.x.\nIn Python 2.5 and earlier, use the comma version, since as isn't supported.\n\n"} {"input": "Search and replace a line in a file in Python I want to loop over the contents of a text file and do a search and replace on some lines and write the result back to the file. 
I could first load the whole file in memory and then write it back, but that probably is not the best way to do it.\nWhat is the best way to do this, within the following code?\nf = open(file)\nfor line in f:\n if line.contains('foo'):\n newline = line.replace('foo', 'bar')\n # how to write this newline back to the file\n\n", "output": "The shortest way would probably be to use the fileinput module. For example, the following adds line numbers to a file, in-place:\nimport fileinput\n\nfor line in fileinput.input(\"test.txt\", inplace=True):\n print \"%d: %s\" % (fileinput.filelineno(), line),\n\nWhat happens here is:\n\nThe original file is moved to a backup file\nThe standard output is redirected to the original file within the loop\nThus any print statements write back into the original file\n\nfileinput has more bells and whistles. For example, it can be used to automatically operate on all files in sys.argv[1:], without your having to iterate over them explicitly. Starting with Python 3.2 it also provides a convenient context manager for use in a with statement.\n\nWhile fileinput is great for throwaway scripts, I would be wary of using it in real code because admittedly it's not very readable or familiar. In real (production) code it's worthwhile to spend just a few more lines of code to make the process explicit and thus make the code readable.\nThere are two options:\n\nThe file is not overly large, and you can just read it wholly to memory. Then close the file, reopen it in writing mode and write the modified contents back.\nThe file is too large to be stored in memory; you can move it over to a temporary file and open that, reading it line by line, writing back into the original file. Note that this requires twice the storage.\n\n"} {"input": "What is a \"callable\" in Python? Now that it's clear what a metaclass is, there is an associated concept that I use all the time without knowing what it really means. 
\nI suppose everybody once made a mistake with parentheses, resulting in an \"object is not callable\" exception. What's more, using __init__ and __new__ leads one to wonder what this bloody __call__ can be used for.\nCould you give me some explanations, including examples with the magic method?\n", "output": "A callable is anything that can be called. \nThe built-in callable (PyCallable_Check in objects.c) checks if the argument is either:\n\nan instance of a class with a __call__ method or\nis of a type that has a non null tp_call (c struct) member which indicates callability otherwise (such as in functions, methods etc.)\n\nThe method named __call__ is (according to the documentation)\n\nCalled when the instance is ''called'' as a function\n\nExample\nclass Foo:\n def __call__(self):\n print 'called'\n\nfoo_instance = Foo()\nfoo_instance() #this is calling the __call__ method\n\n"} {"input": "Best way to convert string to bytes in Python 3? There appear to be two different ways to convert a string to bytes, as seen in the answers to TypeError: 'str' does not support the buffer interface\nWhich of these methods would be better or more Pythonic? Or is it just a matter of personal preference?\nb = bytes(mystring, 'utf-8')\n\nb = mystring.encode('utf-8')\n\n", "output": "If you look at the docs for bytes, it points you to bytearray:\n\nbytearray([source[, encoding[, errors]]])\nReturn a new array of bytes. The bytearray type is a mutable sequence of integers in the range 0 <= x < 256. 
It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has, see Bytes and Byte Array Methods.\nThe optional source parameter can be used to initialize the array in a few different ways:\nIf it is a string, you must also give the encoding (and optionally, errors) parameters; bytearray() then converts the string to bytes using str.encode().\nIf it is an integer, the array will have that size and will be initialized with null bytes.\nIf it is an object conforming to the buffer interface, a read-only buffer of the object will be used to initialize the bytes array.\nIf it is an iterable, it must be an iterable of integers in the range 0 <= x < 256, which are used as the initial contents of the array.\nWithout an argument, an array of size 0 is created.\n\nSo bytes can do much more than just encode a string. It's Pythonic that it would allow you to call the constructor with any type of source parameter that makes sense. \nFor encoding a string, I think that some_string.encode(encoding) is more Pythonic than using the constructor, because it is the most self documenting -- \"take this string and encode it with this encoding\" is clearer than bytes(some_string, encoding) -- there is no explicit verb when you use the constructor.\nEdit: I checked the Python source. If you pass a unicode string to bytes using CPython, it calls PyUnicode_AsEncodedString, which is the implementation of encode; so you're just skipping a level of indirection if you call encode yourself.\nAlso, see Serdalis' comment -- unicode_string.encode(encoding) is also more Pythonic because its inverse is byte_string.decode(encoding) and symmetry is nice.\n"} {"input": "Python unittest - opposite of assertRaises? I want to write a test to establish that an Exception is not raised in a given circumstance. 
\nIt's straightforward to test if an Exception is raised ...\nsInvalidPath=AlwaysSuppliesAnInvalidPath()\nself.assertRaises(PathIsNotAValidOne, MyObject, sInvalidPath) \n\n... but how can you do the opposite? \nSomething like this is what I'm after ...\nsValidPath=AlwaysSuppliesAValidPath()\nself.assertNotRaises(PathIsNotAValidOne, MyObject, sValidPath) \n\n", "output": "try:\n myFunc()\nexcept ExceptionType:\n self.fail(\"myFunc() raised ExceptionType unexpectedly!\")\n\n"} {"input": "Most pythonic way to delete a file which may not exist I want to delete the file filename if it exists. Is it proper to say\nif os.path.exists(filename):\n os.remove(filename)\n\nIs there a better way? A one-line way? \n", "output": "A more pythonic way would be:\ntry:\n os.remove(filename)\nexcept OSError:\n pass\n\nAlthough this takes even more lines and looks very ugly, it avoids the unnecessary call to os.path.exists() and follows the python convention of overusing exceptions.\nIt may be worthwhile to write a function to do this for you:\nimport os, errno\n\ndef silentremove(filename):\n try:\n os.remove(filename)\n except OSError as e: # this would be \"except OSError, e:\" before Python 2.6\n if e.errno != errno.ENOENT: # errno.ENOENT = no such file or directory\n raise # re-raise exception if a different error occurred\n\n"} {"input": "How to clear python interpreter console? Like most Python developers, I typically keep a console window open with the Python interpreter running to test commands, dir() stuff, help() stuff, etc.\nLike any console, after a while the visible backlog of past commands and prints gets to be cluttered, and sometimes confusing when re-running the same command several times. 
I'm wondering if, and how, to clear the Python interpreter console.\nI've heard about doing a system call and either calling cls on Windows or clear on Linux, but I was hoping there was something I could command the interpreter itself to do.\nNote: I'm running on Windows, so Ctrl+L doesn't work.\n", "output": "As you mentioned, you can do a system call:\n>>> import os\n>>> clear = lambda: os.system('cls')\n>>> clear()\n\nI am not sure of any other way in Windows.\n"} {"input": "How do I keep Python print from adding newlines or spaces? In python, if I say\nprint 'h'\n\nI get the letter h and a newline. If I say \nprint 'h',\n\nI get the letter h and no newline. If I say\nprint 'h',\nprint 'm',\n\nI get the letter h, a space, and the letter m. How can I prevent Python from printing the space?\nThe print statements are different iterations of the same loop so I can't just use the + operator.\n", "output": "Just a comment. In Python 3, you will use\nprint('h', end='')\n\nto suppress the endline terminator, and\nprint('a', 'b', 'c', sep='')\n\nto suppress the whitespace separator between items.\n"} {"input": "Is there a reason Python strings don't have a string length method? I know that python has a len() function that is used to determine the size of a string, but I was wondering why it's not a method of the string object.\nUpdate\nOk, I realized I was embarrassingly mistaken. __len__() is actually a method of a string object. It just seems weird to see object oriented code in Python using the len function on string objects. 
Furthermore, it's also weird to see __len__ as the name instead of just len.\n", "output": "Strings do have a length method: __len__()\nThe protocol in Python is to implement this method on objects which have a length and use the built-in len() function, which calls it for you, similar to the way you would implement __iter__() and use the built-in iter() function (or have the method called behind the scenes for you) on objects which are iterable.\nSee Emulating container types for more information.\nHere's a good read on the subject of protocols in Python: Python and the Principle of Least Astonishment\n"} {"input": "Python - The difference between sys.stdout.write and print Are there situations in which sys.stdout.write() is preferable to print? \n(Examples: better performance; code that makes more sense)\n", "output": "print is just a thin wrapper that formats the inputs (space between args and newline at the end) and calls the write function of a given object. By default this object is sys.stdout, but you can pass a file for example:\nprint >> open('file.txt', 'w'), 'Hello', 'World', 2+3\n\nIn Python 3.x, print becomes a function, but it is still possible to pass something else than sys.stdout. See http://docs.python.org/library/functions.html.\nIn Python 2.6+, print is still a statement, but it can be used as a function with\nfrom __future__ import print_function\n\n\nUpdate: There is a little difference between the print function and the print statement (and more generally between a function and a statement) pointed by Bakuriu in comments.\nIn case of error when evaluating arguments:\nprint \"something\", 1/0, \"other\" #prints only something because 1/0 raise an Exception\n\nprint(\"something\", 1/0, \"other\") #doesn't print anything. The func is not called\n\n"} {"input": "How to change the order of DataFrame columns? 
I have the following DataFrame (df):\nimport numpy as np\nimport pandas as pd\n\ndf = pd.DataFrame(np.random.rand(10, 5))\n\nI add more column(s) by assignment:\ndf['mean'] = df.mean(1)\n\nHow can I move the column mean to the front, i.e. set it as first column leaving the order of the other columns untouched?\n", "output": "One easy way would be to reassign the dataframe with a list of the columns, rearranged as needed. \nThis is what you have now: \nIn [6]: df\nOut[6]:\n 0 1 2 3 4 mean\n0 0.445598 0.173835 0.343415 0.682252 0.582616 0.445543\n1 0.881592 0.696942 0.702232 0.696724 0.373551 0.670208\n2 0.662527 0.955193 0.131016 0.609548 0.804694 0.632596\n3 0.260919 0.783467 0.593433 0.033426 0.512019 0.436653\n4 0.131842 0.799367 0.182828 0.683330 0.019485 0.363371\n5 0.498784 0.873495 0.383811 0.699289 0.480447 0.587165\n6 0.388771 0.395757 0.745237 0.628406 0.784473 0.588529\n7 0.147986 0.459451 0.310961 0.706435 0.100914 0.345149\n8 0.394947 0.863494 0.585030 0.565944 0.356561 0.553195\n9 0.689260 0.865243 0.136481 0.386582 0.730399 0.561593\n\nIn [7]: cols = df.columns.tolist()\n\nIn [8]: cols\nOut[8]: [0L, 1L, 2L, 3L, 4L, 'mean']\n\nRearrange cols in any way you want. 
This is how I moved the last element to the first position: \nIn [12]: cols = cols[-1:] + cols[:-1]\n\nIn [13]: cols\nOut[13]: ['mean', 0L, 1L, 2L, 3L, 4L]\n\nThen reorder the dataframe like this: \nIn [16]: df = df[cols] # OR df = df.ix[:, cols]\n\nIn [17]: df\nOut[17]:\n mean 0 1 2 3 4\n0 0.445543 0.445598 0.173835 0.343415 0.682252 0.582616\n1 0.670208 0.881592 0.696942 0.702232 0.696724 0.373551\n2 0.632596 0.662527 0.955193 0.131016 0.609548 0.804694\n3 0.436653 0.260919 0.783467 0.593433 0.033426 0.512019\n4 0.363371 0.131842 0.799367 0.182828 0.683330 0.019485\n5 0.587165 0.498784 0.873495 0.383811 0.699289 0.480447\n6 0.588529 0.388771 0.395757 0.745237 0.628406 0.784473\n7 0.345149 0.147986 0.459451 0.310961 0.706435 0.100914\n8 0.553195 0.394947 0.863494 0.585030 0.565944 0.356561\n9 0.561593 0.689260 0.865243 0.136481 0.386582 0.730399\n\n"} {"input": "How can I compare two lists in python and return matches I want to take two lists and find the values that appear in both.\na = [1, 2, 3, 4, 5]\nb = [9, 8, 7, 6, 5]\n\nreturnMatches(a, b)\n\nwould return [5], for instance.\n", "output": "Not the most efficient one, but by far the most obvious way to do it is:\n>>> a = [1, 2, 3, 4, 5]\n>>> b = [9, 8, 7, 6, 5]\n>>> set(a) & set(b)\n{5}\n\nif order is significant you can do it with list comprehensions like this:\n>>> [i for i, j in zip(a, b) if i == j]\n[5]\n\n(only works for equal-sized lists, which order-significance implies).\n"} {"input": "Post JSON using Python Requests I need to POST a JSON from a client to a server. I'm using Python 2.7.1 and simplejson. The client is using Requests. The server is CherryPy. 
I can GET a hard-coded JSON from the server (code not shown), but when I try to POST a JSON to the server, I get \"400 Bad Request\".\nHere is my client code:\ndata = {'sender': 'Alice',\n 'receiver': 'Bob',\n 'message': 'We did it!'}\ndata_json = simplejson.dumps(data)\npayload = {'json_payload': data_json}\nr = requests.post(\"http://localhost:8080\", data=payload)\n\nHere is the server code.\nclass Root(object):\n\n def __init__(self, content):\n self.content = content\n print self.content # this works\n\n exposed = True\n\n def GET(self):\n cherrypy.response.headers['Content-Type'] = 'application/json'\n return simplejson.dumps(self.content)\n\n def POST(self):\n self.content = simplejson.loads(cherrypy.request.body.read())\n\nAny ideas?\n", "output": "It turns out I was missing the header information. The following works:\nurl = \"http://localhost:8080\"\ndata = {'sender': 'Alice', 'receiver': 'Bob', 'message': 'We did it!'}\nheaders = {'Content-type': 'application/json', 'Accept': 'text/plain'}\nr = requests.post(url, data=json.dumps(data), headers=headers)\n\n"} {"input": "Select rows from a DataFrame based on values in a column in pandas How to select rows from a DataFrame based on values in some column in pandas?\nIn SQL I would use: \nselect * from table where column_name = some_value. 
\n\nI tried to look at pandas documentation but did not immediately find the answer.\n", "output": "To select rows whose column value equals a scalar, some_value, use ==:\ndf.loc[df['column_name'] == some_value]\n\nTo select rows whose column value is in an iterable, some_values, use isin:\ndf.loc[df['column_name'].isin(some_values)]\n\n\nTo select rows whose column value does not equal some_value, use !=:\ndf.loc[df['column_name'] != some_value]\n\nisin returns a boolean Series, so to select rows whose value is not in some_values, negate the boolean Series using ~:\ndf.loc[~df['column_name'].isin(some_values)]\n\n\nFor example,\nimport pandas as pd\nimport numpy as np\ndf = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),\n 'B': 'one one two three two two one three'.split(),\n 'C': np.arange(8), 'D': np.arange(8) * 2})\nprint(df)\n# A B C D\n# 0 foo one 0 0\n# 1 bar one 1 2\n# 2 foo two 2 4\n# 3 bar three 3 6\n# 4 foo two 4 8\n# 5 bar two 5 10\n# 6 foo one 6 12\n# 7 foo three 7 14\n\nprint(df.loc[df['A'] == 'foo'])\n\nyields\n A B C D\n0 foo one 0 0\n2 foo two 2 4\n4 foo two 4 8\n6 foo one 6 12\n7 foo three 7 14\n\n\nIf you have multiple values you want to include, put them in a\nlist (or more generally, any iterable) and use isin:\nprint(df.loc[df['B'].isin(['one','three'])])\n\nyields\n A B C D\n0 foo one 0 0\n1 bar one 1 2\n3 bar three 3 6\n6 foo one 6 12\n7 foo three 7 14\n\n\nNote, however, that if you wish to do this many times, it is more efficient to\nmake an index first, and then use df.loc:\ndf = df.set_index(['B'])\nprint(df.loc['one'])\n\nyields\n A C D\nB \none foo 0 0\none bar 1 2\none foo 6 12\n\nor, to include multiple values from the index use df.index.isin:\ndf.loc[df.index.isin(['one','two'])]\n\nyields\n A C D\nB \none foo 0 0\none bar 1 2\ntwo foo 2 4\ntwo foo 4 8\ntwo bar 5 10\none foo 6 12\n\n"} {"input": "Find all occurrences of a substring in Python Python has string.find() and string.rfind() to get the index of a substring in 
string.\nI wonder, maybe there is something like string.find_all() which can return all found indexes (not only first from beginning or first from end)?\nFor example:\nstring = \"test test test test\"\n\nprint string.find('test') # 0\nprint string.rfind('test') # 15\n\n#that's the goal\nprint string.find_all('test') # [0,5,10,15]\n\n", "output": "There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:\n>>> [m.start() for m in re.finditer('test', 'test test test test')]\n[0, 5, 10, 15]\n\nIf you want to find overlapping matches, lookahead will do that:\n>>> [m.start() for m in re.finditer('(?=tt)', 'ttt')]\n[0, 1]\n\nIf you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:\n>>> search = 'tt'\n>>> [m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]\n[1]\n\nre.finditer returns a generator, so you could change the [] in the above to () to get a generator instead of a list which will be more efficient if you're only iterating through the results once.\n"} {"input": "Checking whether a string starts with XXXX I would like to know how to check whether a string starts with \"hello\" in Python.\nIn Bash I usually do:\nif [[ \"$string\" =~ ^hello ]]; then\n do something here\nfi\n\nHow do I achieve the same in Python?\n", "output": "aString = \"hello world\"\naString.startswith(\"hello\")\n\n"} {"input": "Remove all occurrences of a value from a Python list In Python remove() will remove the first occurrence of value in a list.\nHow to remove all occurrences of a value from a list, without sorting the list?\nThis is what I have in mind.\n>>> x = [1, 2, 3, 4, 2, 2, 3]\n>>> def remove_values_from_list(the_list, val):\n while val in the_list:\n the_list.remove(val)\n>>> remove_values_from_list(x, 2)\n>>> x\n[1, 3, 4, 3]\n\n", "output": "Functional approach:\n2.x\n>>> x = 
[1,2,3,2,2,2,3,4]\n>>> filter(lambda a: a != 2, x)\n[1, 3, 3, 4]\n\n3.x\n>>> list(filter((2).__ne__, x))\n[1, 3, 3, 4]\n\n"} {"input": "Python and pip, list all versions of a package that's available? Given the name of a Python (2.X) package that can be installed with pip and virtualenv, is there any way to find out a list of all the possible versions of it that pip could install? Right now it's trial and error.\nI'm trying to install a version for a third party library, but the newest version is too new, there were backwards incompatible changes made. So I'd like to somehow have a list of all the versions that pip knows about, so that I can test them.\n", "output": "Without actually having to download or install any additional packages you can use the syntax for specifying a particular version while not actually specifying any version, and the available versions will be printed:\n$ pip install pylibmc==\nCollecting pylibmc==\n Could not find a version that satisfies the requirement pylibmc== (from v\n ersions: 0.2, 0.3, 0.4, 0.5.1, 0.5.2, 0.5.3, 0.5.4, 0.5.5, 0.5, 0.6.1, 0.6, \n 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.7, 0.8.1, 0.8.2, 0.8, 0.9.1, 0.9.2, 0.9, \n 1.0-alpha, 1.0-beta, 1.0, 1.1.1, 1.1, 1.2.0, 1.2.1, 1.2.2, 1.2.3, 1.3.0)\nNo matching distribution found for pylibmc==\n\nUpdate: I had changed my answer to have nothing after the ==; this however doesn't work for older versions of pip (those of you should probably update pip). With an older pip, add any string after the == that is not likely to be an install candidate:\npip install pylibmc==blork\n\n"} {"input": "Strip HTML from strings in Python from mechanize import Browser\nbr = Browser()\nbr.open('http://somewebpage')\nhtml = br.response().readlines()\nfor line in html:\n print line\n\nWhen printing a line in an HTML file, I'm trying to find a way to only show the contents of each HTML element and not the formatting itself. If it finds '<a href=\"whatever.com\">some text</a>', it will only print 'some text', '<b>hello</b>' prints 'hello', etc. 
How would one go about doing this?\n", "output": "I always used this function to strip HTML tags, as it requires only the Python stdlib:\nOn Python 2\nfrom HTMLParser import HTMLParser\n\nclass MLStripper(HTMLParser):\n def __init__(self):\n self.reset()\n self.fed = []\n def handle_data(self, d):\n self.fed.append(d)\n def get_data(self):\n return ''.join(self.fed)\n\ndef strip_tags(html):\n s = MLStripper()\n s.feed(html)\n return s.get_data()\n\nFor Python 3\nfrom html.parser import HTMLParser\n\nclass MLStripper(HTMLParser):\n def __init__(self):\n self.reset()\n self.strict = False\n self.convert_charrefs= True\n self.fed = []\n def handle_data(self, d):\n self.fed.append(d)\n def get_data(self):\n return ''.join(self.fed)\n\ndef strip_tags(html):\n s = MLStripper()\n s.feed(html)\n return s.get_data()\n\nNote: this works only for 3.1. For 3.2 or above, you need to call the parent class's init function. See Using HTMLParser in Python 3.2\n"} {"input": "Is there any way to do HTTP PUT in python I need to upload some data to a server using HTTP PUT in python. From my brief reading of the urllib2 docs, it only does HTTP POST. Is there any way to do an HTTP PUT in python?\n", "output": "I've used a variety of python HTTP libs in the past, and I've settled on 'Requests' as my favourite. Existing libs had pretty useable interfaces, but code can end up being a few lines too long for simple operations. 
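To see what "a few lines too long" means, here is a sketch of a hand-rolled PUT with only the standard library — written against Python 3's urllib.request so it can be shown self-contained (in the Python 2 urllib2 the question mentions there is no method argument at all, and the usual trick was overriding Request.get_method); the endpoint URL and body here are placeholders:

```python
import urllib.request

# Build a PUT request by hand: the method, the already-encoded body bytes,
# and the content type are all the caller's responsibility.
req = urllib.request.Request(
    "http://somedomain.org/endpoint",          # placeholder endpoint
    data=b"username=bob&email=bob%40bob.com",  # body must be pre-encoded bytes
    method="PUT",
)
req.add_header("Content-Type", "application/x-www-form-urlencoded")
# urllib.request.urlopen(req) would actually send the request
```

The requests version that follows does all of this in a single call.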
A basic PUT in requests looks like:\n>>> import requests\n>>> payload = {'username': 'bob', 'email': 'bob@bob.com'}\n>>> r = requests.put(\"http://somedomain.org/endpoint\", data=payload)\n\nYou can then check the response status code with:\nr.status_code\n\nor the response with:\nr.content\n\nRequests has a lot of syntactic sugar and shortcuts that'll make your life easier.\n"} {"input": "Python Dictionary Comprehension Is it possible to create a dictionary comprehension in Python (for the keys)?\nWithout list comprehensions, you can use something like this:\nl = []\nfor n in range(1, 11):\n l.append(n)\n\nWe can shorten this to a list comprehension: l = [n for n in range(1, 11)].\nHowever, say I want to set a dictionary's keys to the same value.\nI can do:\nd = {}\nfor n in range(1, 11):\n d[n] = True # same value for each\n\nI've tried this:\nd = {}\nd[i for i in range(1, 11)] = True\n\nHowever, I get a SyntaxError on the for.\nIn addition (I don't need this part, but just wondering), can you set a dictionary's keys to a bunch of different values, like this:\nd = {}\nfor n in range(1, 11):\n d[n] = n\n\nIs this possible with a dictionary comprehension?\nd = {}\nd[i for i in range(1, 11)] = [x for x in range(1, 11)]\n\nThis also raises a SyntaxError on the for.\n", "output": "There are dictionary comprehensions in Python 2.7+, but they don't work quite the way you're trying. Like a list comprehension, they create a new dictionary; you can't use them to add keys to an existing dictionary. Also, you have to specify the keys and values, although of course you can specify a dummy value if you like.\n>>> d = {n: n**2 for n in range(5)}\n>>> print d\n{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}\n\nIf you want to set them all to True:\n>>> d = {n: True for n in range(5)}\n>>> print d\n{0: True, 1: True, 2: True, 3: True, 4: True}\n\nWhat you seem to be asking for is a way to set multiple keys at once on an existing dictionary. There's no direct shortcut for that. 
You can either loop like you already showed, or you could use a dictionary comprehension to create a new dict with the new values, and then do oldDict.update(newDict) to merge the new values into the old dict.\n"} {"input": "Does Python support short-circuiting? Does Python support short-circuiting in boolean expressions?\n", "output": "Yep, both and and or operators short-circuit -- see the docs.\n"} {"input": "How do you round UP a number in Python? This problem is killing me. How does one round a number UP in Python?\nI tried round(number) but it rounds the number down. Example:\nround(2.3) = 2.0 and not 3, what I would like\n\nThen I tried int(number + .5) but it rounds the number down again! Example:\nint(2.3 + .5) = 2\n\nThen I tried round(number + .5) but it won't work in edge cases. Example:\nWAIT! THIS WORKED!\n\nPlease advise.\n", "output": "The ceil (ceiling) function:\nimport math\nprint math.ceil(4.2)\n\n"} {"input": "How can I selectively escape percent (%) in Python strings? 
I have the following code\ntest = \"have it break.\"\nselectiveEscape = \"Print percent % in sentence and not %s\" % test\n\nprint(selectiveEscape)\n\nI would like to get the output:\nPrint percent % in sentence and not have it break.\n\nWhat actually happens:\n selectiveEscape = \"Use percent % in sentence and not %s\" % test\nTypeError: %d format: a number is required, not str\n\n", "output": ">>> test = \"have it break.\"\n>>> selectiveEscape = \"Print percent %% in sentence and not %s\" % test\n>>> print selectiveEscape\nPrint percent % in sentence and not have it break.\n\n"} {"input": "Passing a dictionary to a function in python as keyword parameters I'd like to call a function in python using a dictionary.\nHere is some code:\nd = dict(param='test')\n\ndef f(param):\n print param\n\nf(d)\n\nThis prints {'param': 'test'} but I'd like it to just print test.\nI'd like it to work similarly for more parameters:\nd = dict(p1=1, p2=2)\ndef f2(p1,p2):\n print p1, p2\nf2(d)\n\nIs this possible?\n", "output": "Figured it out for myself in the end. It is simple, I was just missing the ** operator to unpack the dictionary\nSo my example becomes:\nd = dict(p1=1, p2=2)\ndef f2(p1,p2):\n print p1, p2\nf2(**d)\n\n"} {"input": "Text Progress Bar in the Console Is there a good way to do the following?\nI wrote a simple console app to upload and download files from an FTP server using the ftplib.\nEach time some data chunks are downloaded, I want to update a text progress bar, even if it's just a number.\nBut I don't want to erase all the text that's been printed to the console. (Doing a \"clear\" and then printing the updated percentage.)\n", "output": "Writing '\\r' will move the cursor back to the beginning of the line.\nThis displays a percentage counter:\nimport time\nimport sys\n\nfor i in range(100):\n time.sleep(1)\n sys.stdout.write(\"\\r%d%%\" % i)\n sys.stdout.flush()\n\n"} {"input": "Is it possible to run a Python script as a service in Windows? If possible, how? 
I am sketching the architecture for a set of programs that share various interrelated objects stored in a database. I want one of the programs to act as a service which provides a higher level interface for operations on these objects, and the other programs to access the objects through that service.\nI am currently aiming for Python and the Django framework as the technologies to implement that service with. I'm pretty sure I can figure out how to daemonize the Python program in Linux. However, it is an optional spec item that the system should support Windows. I have little experience with Windows programming and no experience at all with Windows services.\nIs it possible to run a Python program as a Windows service (i.e. run it automatically without user login)? I won't necessarily have to implement this part, but I need a rough idea how it would be done in order to decide whether to design along these lines.\nEdit: Thanks for all the answers so far, they are quite comprehensive. I would like to know one more thing: How is Windows aware of my service? Can I manage it with the native Windows utilities? Basically, what is the equivalent of putting a start/stop script in /etc/init.d?\n", "output": "Yes you can. 
I do it using the pythoncom libraries that come included with ActivePython or can be installed with pywin32 (Python for Windows extensions).\nThis is a basic skeleton for a simple service:\nimport win32serviceutil\nimport win32service\nimport win32event\nimport servicemanager\nimport socket\n\n\nclass AppServerSvc (win32serviceutil.ServiceFramework):\n _svc_name_ = \"TestService\"\n _svc_display_name_ = \"Test Service\"\n\n def __init__(self,args):\n win32serviceutil.ServiceFramework.__init__(self,args)\n self.hWaitStop = win32event.CreateEvent(None,0,0,None)\n socket.setdefaulttimeout(60)\n\n def SvcStop(self):\n self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)\n win32event.SetEvent(self.hWaitStop)\n\n def SvcDoRun(self):\n servicemanager.LogMsg(servicemanager.EVENTLOG_INFORMATION_TYPE,\n servicemanager.PYS_SERVICE_STARTED,\n (self._svc_name_,''))\n self.main()\n\n def main(self):\n pass\n\nif __name__ == '__main__':\n win32serviceutil.HandleCommandLine(AppServerSvc)\n\nYour code would go in the main() method, usually with some kind of infinite loop that can be interrupted by checking a flag that you set in the SvcStop method.\n"} {"input": "What is the difference between an expression and a statement in Python? In Python, what is the difference between expressions and statements?\n", "output": "Expressions only contain identifiers, literals and operators, where operators include arithmetic and boolean operators, the function call operator () the subscription operator [] and similar, and can be reduced to some kind of \"value\", which can be any Python object. Examples:\n3 + 5\nmap(lambda x: x*x, range(10))\n[a.x for a in some_iterable]\nyield 7\n\nStatements (see 1, 2), on the other hand, are everything that can make up a line (or several lines) of Python code. Note that expressions are statements as well. 
Examples:\n# all the above expressions\nprint 42\nif x: do_y()\nreturn\na = 7\n\n"} {"input": "Django gives Bad Request (400) when DEBUG = False I am new to django-1.6. When I run the django server with DEBUG = True, it's running perfectly. But when I change DEBUG to False in the settings file, the server stops and gives the following error on the command prompt:\nCommandError: You must set settings.ALLOWED_HOSTS if DEBUG is False.\n\nAfter I changed ALLOWED_HOSTS to [\"http://127.0.0.1:8000\",], in the browser I get the error:\nBad Request (400)\n\nIs it possible to run Django without debug mode?\n", "output": "The ALLOWED_HOSTS list should contain fully qualified host names, not URLs. Leave out the port and the protocol. If you are using 127.0.0.1, I would add localhost to the list too:\nALLOWED_HOSTS = ['127.0.0.1', 'localhost']\n\nYou could also use * to match any host:\nALLOWED_HOSTS = ['*']\n\nQuoting the documentation:\n\nValues in this list can be fully qualified names (e.g. 'www.example.com'), in which case they will be matched against the request's Host header exactly (case-insensitive, not including port). A value beginning with a period can be used as a subdomain wildcard: '.example.com' will match example.com, www.example.com, and any other subdomain of example.com. A value of '*' will match anything; in this case you are responsible to provide your own validation of the Host header (perhaps in a middleware; if so this middleware must be listed first in MIDDLEWARE_CLASSES).\n\nBold emphasis mine.\nThe status 400 response you get is due to a SuspiciousOperation exception being raised when your host header doesn't match any values in that list.\n"} {"input": "listing all functions in a python module I have a python module installed on my system and I'd like to be able to see what functions/classes/methods are available in it. \nI want to call the doc function on each one. 
In ruby I can do something like ClassName.methods to get a list of all the methods available on that class. Is there something similar in python?\neg. something like:\nfrom somemodule import foo\nprint foo.methods # or whatever is the correct method to call\n\n", "output": "You can use dir(module) to see all available methods/attributes. Also check out PyDocs.\n"} {"input": "When is del useful in python? I can't really think of any reason why python needs the del keyword (and most languages seem to not have a similar keyword). For instance, rather than deleting a variable, one could just assign None to it. And when deleting from a dictionary, a del method could be added.\nIs there any reason to keep del in python, or is it a vestige of Python's pre-garbage collection days?\n", "output": "Firstly, you can del other things besides local variables\ndel list_item[4]\ndel dictionary[\"alpha\"]\n\nBoth of which should be clearly useful. Secondly, using del on a local variable makes the intent clearer. Compare:\n del foo\n\nto\n foo = None\n\nI know in the case of del foo that the intent is to remove the variable from scope. It's not clear that foo = None is doing that. If somebody just assigned foo = None I might think it was dead code. But I instantly know what somebody who codes del foo was trying to do.\n"} {"input": "Permanently add a directory to PYTHONPATH Whenever I use sys.path.append, the new directory will be added. However, once I close python, the list will revert to the previous (default?) values. 
How do I permanently add a directory to PYTHONPATH?\n", "output": "If you're using bash (on a Mac or GNU/Linux distro), add this to your ~/.bashrc\nexport PYTHONPATH=\"${PYTHONPATH}:/my/other/path\"\n\n"} {"input": "Seeking clarification on apparent contradictions regarding weakly typed languages I think I understand strong typing, but every time I look for examples for what is weak typing I end up finding examples of programming languages that simply coerce/convert types automatically.\nFor instance, in this article named Typing: Strong vs. Weak, Static vs. Dynamic says that Python is strongly typed because you get an exception if you try to:\nPython\n1 + \"1\"\nTraceback (most recent call last):\nFile \"<stdin>\", line 1, in ? \nTypeError: unsupported operand type(s) for +: 'int' and 'str'\n\nHowever, such a thing is possible in Java and in C#, and we do not consider them weakly typed just for that.\nJava\n int a = 10;\n String b = \"b\";\n String result = a + b;\n System.out.println(result);\n\nC#\nint a = 10;\nstring b = \"b\";\nstring c = a + b;\nConsole.WriteLine(c);\n\nIn another article named Weakly Type Languages the author says that Perl is weakly typed simply because I can concatenate a string to a number and vice versa without any explicit conversion.\nPerl\n$a=10;\n$b=\"a\";\n$c=$a.$b;\nprint $c; #10a\n\nSo the same example makes Perl weakly typed, but not Java and C#?\nGee, this is confusing \nThe authors seem to imply that a language that prevents the application of certain operations on values of different types is strongly typed and the contrary means weakly typed.\nTherefore, at some point I have felt prompted to believe that if a language provides a lot of automatic conversions or coercions between types (as Perl does) it may end up being considered weakly typed, whereas other languages that provide only a few conversions may end up being considered strongly typed. 
\nI am inclined to believe, though, that I must be wrong in this interepretation, I just do not know why or how to explain it.\nSo, my questions are: \n\nWhat does it really mean for a language to be truly weakly typed? \nCould you mention any good examples of weakly typing that are not related to automatic conversion/automatic coercion done by the language?\nCan a language be weakly typed and strongly typed at the same time?\n\n", "output": "UPDATE: This question was the subject of my blog on the 15th of October, 2012. Thanks for the great question!\n\n\nWhat does it really mean for a language to be \"weakly typed\"? \n\nIt means \"this language uses a type system that I find distasteful\". A \"strongly typed\" language by contrast is a language with a type system that I find pleasant.\nThe terms are essentially meaningless and you should avoid them. Wikipedia lists eleven different meanings for \"strongly typed\", several of which are contradictory. This indicates that the odds of confusion being created are high in any conversation involving the term \"strongly typed\" or \"weakly typed\". \nAll that you can really say with any certainty is that a \"strongly typed\" language under discussion has some additional restriction in the type system, either at runtime or compile time, that a \"weakly typed\" language under discussion lacks. What that restriction might be cannot be determined without further context.\nInstead of using \"strongly typed\" and \"weakly typed\", you should describe in detail what kind of type safety you mean. For example, C# is a statically typed language and a type safe language and a memory safe language, for the most part. C# allows all three of those forms of \"strong\" typing to be violated. The cast operator violates static typing; it says to the compiler \"I know more about the runtime type of this expression than you do\". If the developer is wrong, then the runtime will throw an exception in order to protect type safety. 
If the developer wishes to break type safety or memory safety, they can do so by turning off the type safety system by making an \"unsafe\" block. In an unsafe block you can use pointer magic to treat an int as a float (violating type safety) or to write to memory you do not own. (Violating memory safety.)\nC# imposes type restrictions that are checked at both compile-time and at runtime, thereby making it a \"strongly typed\" language compared to languages that do less compile-time checking or less runtime checking. C# also allows you to in special circumstances do an end-run around those restrictions, making it a \"weakly typed\" language compared with languages which do not allow you to do such an end-run. \nWhich is it really? It is impossible to say; it depends on the point of view of the speaker and their attitude towards the various language features. \n"} {"input": "Web scraping with Python I'd like to grab daily sunrise/sunset times from a web site. Is it possible to scrape web content with Python? what are the modules used? Is there any tutorial available?\n", "output": "Use urllib2 in combination with the brilliant BeautifulSoup library:\nimport urllib2\nfrom BeautifulSoup import BeautifulSoup\n# or if you're using BeautifulSoup4:\n# from bs4 import BeautifulSoup\n\nsoup = BeautifulSoup(urllib2.urlopen('http://example.com').read())\n\nfor row in soup('table', {'class': 'spad'})[0].tbody('tr'):\n tds = row('td')\n print tds[0].string, tds[1].string\n # will print date and sunrise\n\n"} {"input": "How to use Xpath in Python? What is the library? Is there a full implementation? How is the library used? Where is its website?\n", "output": "libxml2 has a number of advantages:\n\nCompliance to the spec\nActive development and a community participation \nSpeed. This is really a python wrapper around a C implementation. \nUbiquity. The libxml2 library is pervasive and thus well tested.\n\nDownsides include:\n\nCompliance to the spec. It's strict. 
Things like default namespace handling are easier in other libraries.\nUse of native code. This can be a pain depending on how your application is distributed / deployed. RPMs are available that ease some of this pain.\nManual resource handling. Note in the sample below the calls to freeDoc() and xpathFreeContext(). This is not very Pythonic.\n\nIf you are doing simple path selection, stick with ElementTree ( which is included in Python 2.5 ). If you need full spec compliance or raw speed and can cope with the distribution of native code, go with libxml2.\nSample of libxml2 XPath Use\n\nimport sys\nimport libxml2\n\ndoc = libxml2.parseFile(\"tst.xml\")\nctxt = doc.xpathNewContext()\nres = ctxt.xpathEval(\"//*\")\nif len(res) != 2:\n print \"xpath query: wrong node set size\"\n sys.exit(1)\nif res[0].name != \"doc\" or res[1].name != \"foo\":\n print \"xpath query: wrong node set value\"\n sys.exit(1)\ndoc.freeDoc()\nctxt.xpathFreeContext()\n\nSample of ElementTree XPath Use\n\nfrom elementtree.ElementTree import ElementTree\nmydoc = ElementTree(file='tst.xml')\nfor e in mydoc.findall('/foo/bar'):\n print e.get('title').text\n\n"} {"input": "Expanding tuples into arguments Is there a way to expand a Python tuple into a function - as actual parameters?\nFor example, here expand() does the magic:\ntuple = (1, \"foo\", \"bar\")\n\ndef myfun(number, str1, str2):\n return (number * 2, str1 + str2, str2 + str1)\n\nmyfun(expand(tuple)) # (2, \"foobar\", \"barfoo\")\n\nI know one could define myfun as myfun((a, b, c)), but of course there may be legacy code.\nThanks\n", "output": "myfun(*tuple) does exactly what you request.\nSide issue: don't use builtin type names such as tuple, list, file, set, and so forth as your identifiers -- it's horrible practice and it will come back and byte you when you least expect it,\nso just get into the habit of actively avoiding hiding builtin names with your own identifiers.\n"} {"input": "Is there a short contains function for lists? 
I see people are using any to gather another list to see if an item exists in a list, but is there a quick way to just do?:\nif list.contains(myItem):\n # do something\n\n", "output": "You can use this syntax:\nif myItem in list:\n # do something\n\nAlso, the inverse operator:\nif myItem not in list:\n # do something\n\nIt works fine for lists, tuples, sets and dicts (for dicts, it checks keys).\nNote that this is an O(n) operation in lists and tuples, but an O(1) operation in sets and dicts.\n"} {"input": "Finding the average of a list I have to find the average of a list in Python. This is my code so far\nl = [15, 18, 2, 36, 12, 78, 5, 6, 9]\nprint reduce(lambda x, y: x + y, l)\n\nI've got it so it adds together the values in the list, but I don't know how to make it divide into them?\n", "output": "l = [15, 18, 2, 36, 12, 78, 5, 6, 9]\nsum(l) / float(len(l))\n\n"} {"input": "Split a string by spaces -- preserving quoted substrings -- in Python I have a string which is like this:\nthis is \"a test\"\n\nI'm trying to write something in Python to split it up by space while ignoring spaces within quotes. The result I'm looking for is:\n['this','is','a test']\n\nPS. I know you are going to ask \"what happens if there are quotes within the quotes, well, in my application, that will never happen.\n", "output": "You want split, from the shlex module.\n>>> import shlex\n>>> shlex.split('this is \"a test\"')\n['this', 'is', 'a test']\n\nThis should do exactly what you want.\n"} {"input": "How can I convert a datetime object to milliseconds since epoch (unix time) in Python? 
I have a Python datetime object that I want to convert to unix time, or seconds/milliseconds since the 1970 epoch.\nHow do I do this?\n", "output": "It appears to me that the simplest way to do this is\nimport datetime\n\nepoch = datetime.datetime.utcfromtimestamp(0)\n\ndef unix_time_millis(dt):\n return (dt - epoch).total_seconds() * 1000.0\n\n"} {"input": "How to change the font size on a matplotlib plot How does one change the font size for all elements (ticks, labels, title) on a matplotlib plot?\nI know how to change the tick label sizes, this is done with:\nimport matplotlib \nmatplotlib.rc('xtick', labelsize=20) \nmatplotlib.rc('ytick', labelsize=20) \n\nBut how does one change the rest?\n", "output": "From the matplotlib documentation, \nfont = {'family' : 'normal',\n 'weight' : 'bold',\n 'size' : 22}\n\nmatplotlib.rc('font', **font)\n\nThis sets the font of all items to the font specified by the kwargs object, font.\nAlternatively, you could also use the rcParams update method as suggested in this answer:\nmatplotlib.rcParams.update({'font.size': 22})\n\nYou can find a full list of available properties on the Customizing matplotlib page.\n"} {"input": "Check if a given key already exists in a dictionary and increment it Given a dictionary, how can I find out if a given key in that dictionary has already been set to a non-None value?\nI.e., I want to do this:\nmy_dict = {}\n\nif (my_dict[key] != None):\n my_dict[key] = 1\nelse:\n my_dict[key] += 1\n\nI.e., I want to increment the value if there's already one there, or set it to 1 otherwise.\n", "output": "You are looking for collections.defaultdict (available for Python 2.5+). This\nfrom collections import defaultdict\n\nmy_dict = defaultdict(int)\nmy_dict[key] += 1\n\nwill do what you want.\nFor regular Python dicts, if there is no value for a given key, you will not get None when accessing the dict -- a KeyError will be raised. 
So if you want to use a regular dict, instead of your code you would use\nif key in my_dict:\n my_dict[key] += 1\nelse:\n my_dict[key] = 1\n\n"} {"input": "Does Python SciPy need BLAS? numpy.distutils.system_info.BlasNotFoundError: \n Blas (http://www.netlib.org/blas/) libraries not found.\n Directories to search for the libraries can be specified in the\n numpy/distutils/site.cfg file (section [blas]) or by setting\n the BLAS environment variable.\n\nWhich tar do I need to download off this site?\nI've tried the fortrans, but I keep getting this error (after setting the environment variable obviously).\n", "output": "If you need to use the latest versions of SciPy rather than the packaged version, without going through the hassle of building BLAS and LAPACK, you can follow the below procedure.\nInstall linear algebra libraries from repository (for Ubuntu),\nsudo apt-get install gfortran libopenblas-dev liblapack-dev\n\nThen install SciPy (after downloading the SciPy source): python setup.py install, or\npip install scipy\n\nAs the case may be.\n"} {"input": "Return None if Dictionary key is not available I need a way to get a dictionary value if its key exists, or simply return None, if it does not.\nHowever, Python raises a KeyError if you search for a key that does not exist. I know that I can check for the key, but I am looking for something more explicit. Is there a way to just return None if the key does not exist?\n", "output": "You can use get()\nvalue = d.get(key)\n\nwhich will return None if key is not in d. You can also provide a different default value that will be returned instead of None:\nvalue = d.get(key, \"empty\")\n\n"} {"input": "How to query as GROUP BY in django? 
I query a model, \nMembers.objects.all()\n\nand it returns, say,\nEric, Salesman, X-Shop\nFreddie, Manager, X2-Shop\nTeddy, Salesman, X2-Shop\nSean, Manager, X2-Shop\n\nWhat I want is to know the best Django way to fire a group_by query to my db, like:\nMembers.objects.all().group_by('designation')\n\nWhich doesn't work of course.\nI know we can do some tricks on \"django/db/models/query.py\", \nbut I am just curious to know how to do it without patching.\n", "output": "If you mean to do aggregation and are using Django 1.1 (currently in alpha 1), you can use the new aggregation features of the ORM:\nfrom django.db.models import Count\nMembers.objects.values('designation').annotate(dcount=Count('designation'))\n\nThis results in a query similar to\nSELECT designation, COUNT(designation) AS dcount\nFROM members GROUP BY designation\n\nand the output would be of the form\n[{'designation': 'Salesman', 'dcount': 2}, \n {'designation': 'Manager', 'dcount': 2}]\n\n"} {"input": "Pandas: change data type of columns I want to convert a table, represented as a list of lists, into a Pandas DataFrame. As an extremely simplified example:\na = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']]\ndf = pd.DataFrame(a)\n\nWhat is the best way to convert the columns to the appropriate types, in this case columns 2 and 3 into floats? Is there a way to specify the types while converting to DataFrame? Or is it better to create the DataFrame first and then loop through the columns to change the type for each column? Ideally I would like to do this in a dynamic way because there can be hundreds of columns and I don't want to specify exactly which columns are of which type. All I can guarantee is that each column contains values of the same type.\n", "output": "How about this? 
\na = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']]\ndf = pd.DataFrame(a, columns=['one', 'two', 'three'])\ndf\nOut[16]: \n one two three\n0 a 1.2 4.2\n1 b 70 0.03\n2 x 5 0\n\ndf.dtypes\nOut[17]: \none object\ntwo object\nthree object\n\ndf[['two', 'three']] = df[['two', 'three']].astype(float)\n\ndf.dtypes\nOut[19]: \none object\ntwo float64\nthree float64\n\n"} {"input": "How do I get a list of methods in a Python class? I want to iterate through the methods in a class, or handle class or instance objects differently based on the methods present. How do I get a list of class methods?\nAlso see:\n\nHow can I list the methods in a Python 2.5 module?\nLooping over a Python / IronPython Object Methods\nFinding the methods an object has\nHow do I look inside a Python object?\nHow Do I Perform Introspection on an Object in Python 2.x?\nHow to get a complete list of object's methods and attributes?\nFinding out which functions are available from a class instance in python?\n\n", "output": "An example (listing the methods of the optparse.OptionParser class):\n>>> from optparse import OptionParser\n>>> import inspect\n>>> inspect.getmembers(OptionParser, predicate=inspect.ismethod)\n[('__init__', <unbound method OptionParser.__init__>),\n...\n ('add_option', <unbound method OptionParser.add_option>),\n ('add_option_group', <unbound method OptionParser.add_option_group>),\n ('add_options', <unbound method OptionParser.add_options>),\n ('check_values', <unbound method OptionParser.check_values>),\n ('destroy', <unbound method OptionParser.destroy>),\n ('disable_interspersed_args', <unbound method OptionParser.disable_interspersed_args>),\n ('enable_interspersed_args', <unbound method OptionParser.enable_interspersed_args>),\n ('error', <unbound method OptionParser.error>),\n ('exit', <unbound method OptionParser.exit>),\n ('expand_prog_name', <unbound method OptionParser.expand_prog_name>),\n ...\n ]\n\nNotice that getmembers returns a list of 2-tuples. 
The first item is the name of the member, the second item is the value.\nYou can also pass an instance to getmembers:\n>>> parser = OptionParser()\n>>> inspect.getmembers(parser, predicate=inspect.ismethod)\n...\n\n"} {"input": "How to use timeit module I understand the concept of what timeit does but I am not sure how to implement it in my code.\nHow can I compare two functions, say insertion_sort and tim_sort, with timeit?\n", "output": "If you want to use timeit in an interactive Python session, there are two convenient options:\n\nUse the IPython shell. It features the convenient %timeit special function:\nIn [1]: def f(x):\n ...: return x*x\n ...: \n\nIn [2]: %timeit for x in range(100): f(x)\n100000 loops, best of 3: 20.3 us per loop\n\nIn a standard Python interpreter, you can access functions and other names you defined earlier during the interactive session by importing them from __main__ in the setup statement:\n>>> def f(x):\n... return x * x \n... \n>>> import timeit\n>>> timeit.repeat(\"for x in range(100): f(x)\", \"from __main__ import f\",\n number=100000)\n[2.0640320777893066, 2.0876040458679199, 2.0520210266113281]\n\n\n"} {"input": "How can I convert a character to an integer in Python, and vice versa? I want to get, given a character, its ASCII value.\nFor example, for the character a, I want to get 97, and vice versa.\n", "output": "Use chr() and ord():\n>>> chr(97)\n'a'\n>>> ord('a')\n97\n\n"} {"input": "What is the easiest way to remove all packages installed by pip? I'm trying to fix up one of my virtualenvs - I'd like to reset all of the installed libraries back to the ones that match production.\nIs there a quick and easy way to do this with pip?\n", "output": "I've found this snippet as an alternative solution. 
It's a more graceful removal of libraries than remaking the virtualenv:\npip freeze | xargs pip uninstall -y\n\n\nIn case you have packages installed via VCS, you need to exclude those lines and remove the packages manually (elevated from the comments below):\npip freeze | grep -v \"^-e\" | xargs pip uninstall -y\n\n"} {"input": "Python pip install fails: invalid command egg_info I find that recently often when I try to install a Python package using pip, I get the error(s) below.\nI found a reference online that one has to use \"python2 setup.py install\" from the download directory, and indeed find that this will then work if I manually find and download the package (from pypi).\nBut, I don't know where pip is downloading packages to, and/or why it is failing in this manner.\nI tried to do a pip upgrade, but it also failed in a similar manner, with a bunch of \"Unknown distribution option\" errors (entry_points, zip_safe, test_suite, tests_require)!\n\npip 1.0.1\nActivePython 2.7\n\nTrying to use ActiveState's pypm fails, because they have a smaller library base, and it doesn't include these packages.\nC:\\test>pip install requests-oauth\nDownloading/unpacking requests-oauth\n Downloading requests-oauth-0.4.1.tar.gz\n Running setup.py egg_info for package requests-oauth\n E:\\Plang\\ActivePython\\lib\\distutils\\dist.py:267: UserWarning: Unknown distribution option: 'zip_safe'\n warnings.warn(msg)\n E:\\Plang\\ActivePython\\lib\\distutils\\dist.py:267: UserWarning: Unknown distribution option: 'install_requires'\n warnings.warn(msg)\n usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]\n or: -c --help [cmd1 cmd2 ...]\n or: -c --help-commands\n or: -c cmd --help\n\n error: invalid command 'egg_info'\n Complete output from command python setup.py egg_info:\n E:\\Plang\\ActivePython\\lib\\distutils\\dist.py:267: UserWarning: Unknown distribution option: 'zip_safe'\n warnings.warn(msg)\n\nE:\\Plang\\ActivePython\\lib\\distutils\\dist.py:267: UserWarning: 
Unknown distribution option: 'install_requires'\n warnings.warn(msg)\n\nusage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]\n or: -c --help [cmd1 cmd2 ...]\n or: -c --help-commands\n or: -c cmd --help\n\nerror: invalid command 'egg_info'\n\n", "output": "Install distribute, which comes with egg_info.\nShould be as simple as pip install Distribute.\nDistribute has been merged into Setuptools as of version 0.7. If you are using a version <=0.6, upgrade using pip install --upgrade setuptools or easy_install -U setuptools.\n"} {"input": "When is \"i += x\" different from \"i = i + x\" in Python? I was told that += can have different effects than the standard notation of i = i +. Is there a case in which i += 1 would be different from i = i + 1?\n", "output": "This depends entirely on the object i. \n+= calls the __iadd__ method (if it exists -- falling back on __add__ if it doesn't exist) whereas + calls the __add__ method1. \nFrom an API perspective, __iadd__ is supposed to be used for modifying mutable objects in place (returning the object which was mutated) whereas __add__ should return a new instance of something. For immutable objects, both methods return a new instance, but __iadd__ will put the new instance in the current namespace with the same name that the old instance had. This is why \ni = 1\ni += 1\n\nseems to increment i. In reality, you get a new integer and assign it \"on top of\" i -- losing one reference to the old integer. In this case, i += 1 is exactly the same as i = i + 1. 
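A quick way to verify the rebinding described above is to watch the object's identity with the built-in id() function -- if the identity changes, a new object was created. This is a small illustrative sketch, not part of the original answer:

```python
i = 1
old_id = id(i)
i += 1
# ints are immutable: += created a new int and rebound the name i,
# so the identity changed -- exactly what i = i + 1 does.
print(id(i) == old_id)  # prints: False
```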
But, with most mutable objects, it's a different story:\nAs a concrete example:\na = [1, 2, 3]\nb = a\nb += [1, 2, 3]\nprint a #[1, 2, 3, 1, 2, 3]\nprint b #[1, 2, 3, 1, 2, 3]\n\ncompared to:\na = [1, 2, 3]\nb = a\nb = b + [1, 2, 3]\nprint a #[1, 2, 3]\nprint b #[1, 2, 3, 1, 2, 3]\n\nnotice how in the first example, since b and a reference the same object, when I use += on b, it actually changes b (and a sees that change too -- After all, it's referencing the same list). In the second case however, when I do b = b + [1, 2, 3], this takes the list that b is referencing and concatenates it with a new list [1, 2, 3]. It then stores the concatenated list in the current namespace as b -- With no regard for what b was the line before.\n\n1In the expression x + y, if x.__add__ isn't implemented or if x.__add__(y) returns NotImplemented and x and y have different types, then x + y tries to call y.__radd__(x). So, in the case where you have \nfoo_instance += bar_instance\nif Foo doesn't implement __add__ or __iadd__ then the result here is the same as \nfoo_instance = bar_instance.__radd__(foo_instance)\n"} {"input": "How to read csv into record array in numpy? I wonder if there is a direct way to import the contents of a csv file into a record array, much in the way that R's read.table(), read.delim(), and read.csv() family imports data to R's data frame? Or is the best way to use csv.reader() and then apply something like numpy.core.records.fromrecords()?\n", "output": "You can use Numpy's genfromtxt() method to do so, by setting the delimiter kwarg to a comma.\nfrom numpy import genfromtxt\nmy_data = genfromtxt('my_file.csv', delimiter=',')\n\nMore information on the function can be found at its respective documentation.\n"} {"input": "Why compile Python code? Why would you compile a Python script? You can run them directly from the .py file and it works fine, so is there a performance advantage or something? 
\nI also notice that some files in my application get compiled into .pyc while others do not, why is this?\n", "output": "It's compiled to bytecode which can be used much, much, much faster.\nThe reason some files aren't compiled is that the main script, which you invoke with python main.py is recompiled every time you run the script. All imported scripts will be compiled and stored on the disk.\nImportant addition by Ben Blank:\n\nIt's worth noting that while running a\n compiled script has a faster startup\n time (as it doesn't need to be\n compiled), it doesn't run any\n faster.\n\n"} {"input": "What are the differences between numpy arrays and matrices? Which one should I use? What are the advantages and disadvantages of each?\nFrom what I've seen, either one can work as a replacement for the other if need be, so should I bother using both or should I stick to just one of them?\nWill the style of the program influence my choice? I am doing some machine learning using numpy, so there are indeed lots of matrices, but also lots of vectors (arrays).\n", "output": "Numpy matrices are strictly 2-dimensional, while numpy arrays (ndarrays) are\nN-dimensional. Matrix objects are a subclass of ndarray, so they inherit all\nthe attributes and methods of ndarrays.\nThe main advantage of numpy matrices is that they provide a convenient notation\nfor matrix multiplication: if a and b are matrices, then a*b is their matrix\nproduct.\nimport numpy as np\n\na=np.mat('4 3; 2 1')\nb=np.mat('1 2; 3 4')\nprint(a)\n# [[4 3]\n# [2 1]]\nprint(b)\n# [[1 2]\n# [3 4]]\nprint(a*b)\n# [[13 20]\n# [ 5 8]]\n\nBoth matrix objects and ndarrays have .T to return the transpose, but matrix\nobjects also have .H for the conjugate transpose, and .I for the inverse.\nIn contrast, numpy arrays consistently abide by the rule that operations are\napplied element-wise. 
Thus, if a and b are numpy arrays, then a*b is the array\nformed by multiplying the components element-wise:\nc=np.array([[4, 3], [2, 1]])\nd=np.array([[1, 2], [3, 4]])\nprint(c*d)\n# [[4 6]\n# [6 4]]\n\nTo obtain the result of matrix multiplication, you use np.dot :\nprint(np.dot(c,d))\n# [[13 20]\n# [ 5 8]]\n\nThe ** operator also behaves differently:\nprint(a**2)\n# [[22 15]\n# [10 7]]\nprint(c**2)\n# [[16 9]\n# [ 4 1]]\n\nSince a is a matrix, a**2 returns the matrix product a*a.\nSince c is an ndarray, c**2 returns an ndarray with each component squared\nelement-wise.\nThere are other technical differences between matrix objects and ndarrays\n(having to do with np.ravel, item selection and sequence behavior).\nThe main advantage of numpy arrays is that they are more general than\n2-dimensional matrices. What happens when you want a 3-dimensional array? Then\nyou have to use an ndarray, not a matrix object. Thus, learning to use matrix\nobjects is more work -- you have to learn matrix object operations, and\nndarray operations.\nWriting a program that uses both matrices and arrays makes your life difficult\nbecause you have to keep track of what type of object your variables are, lest\nmultiplication return something you don't expect.\nIn contrast, if you stick solely with ndarrays, then you can do everything\nmatrix objects can do, and more, except with slightly different\nfunctions/notation.\nIf you are willing to give up the visual appeal of numpy matrix product\nnotation, then I think numpy arrays are definitely the way to go.\nPS. Of course, you really don't have to choose one at the expense of the other,\nsince np.asmatrix and np.asarray allow you to convert one to the other (as\nlong as the array is 2-dimensional).\n"} {"input": "Implement touch using Python? touch is a Unix utility that sets the modification and access times of files to the current time of day. 
If the file doesn't exist, it is created with default permissions.\nHow would you implement it as a Python function? Try to be cross platform and complete.\n(Current Google results for \"python touch file\" are not that great, but point to os.utime.)\n", "output": "This tries to be a little more race-free than the other solutions. (The with keyword is new in Python 2.5.)\nimport os\ndef touch(fname, times=None):\n with open(fname, 'a'):\n os.utime(fname, times)\n\nRoughly equivalent to this.\nimport os\ndef touch(fname, times=None):\n fhandle = open(fname, 'a')\n try:\n os.utime(fname, times)\n finally:\n fhandle.close()\n\nNow, to really make it race-free, you need to use futimes and change the timestamp of the open filehandle, instead of opening the file and then changing the timestamp on the filename (which may have been renamed). Unfortunately, Python doesn't seem to provide a way to call futimes without going through ctypes or similar...\n\nEDIT\nAs noted by Nate Parsons, Python 3.3 will add specifying a file descriptor (when os.supports_fd) to functions such as os.utime, which will use the futimes syscall instead of the utimes syscall under the hood. In other words:\nimport os\ndef touch(fname, mode=0o666, dir_fd=None, **kwargs):\n flags = os.O_CREAT | os.O_APPEND\n with os.fdopen(os.open(fname, flags=flags, mode=mode, dir_fd=dir_fd)) as f:\n os.utime(f.fileno() if os.utime in os.supports_fd else fname,\n dir_fd=None if os.supports_fd else dir_fd, **kwargs)\n\n"} {"input": "Turn a string into a valid filename in Python I have a string that I want to use as a filename, so I want to remove all characters that wouldn't be allowed in filenames, using Python.\nI'd rather be strict than otherwise, so let's say I want to retain only letters, digits, and a small set of other characters like \"_-.() \". 
What's the most elegant solution?\nThe filename needs to be valid on multiple operating systems (Windows, Linux and Mac OS) - it's an MP3 file in my library with the song title as the filename, and is shared and backed up between 3 machines.\n", "output": "You can look at the Django framework for how they create a \"slug\" from arbitrary text. A slug is URL- and filename- friendly.\nTheir template/defaultfilters.py (at around line 183) defines a function, slugify, that's probably the gold standard for this kind of thing. Essentially, their code is the following.\ndef slugify(value):\n \"\"\"\n Normalizes string, converts to lowercase, removes non-alpha characters,\n and converts spaces to hyphens.\n \"\"\"\n import unicodedata\n value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore')\n value = unicode(re.sub('[^\\w\\s-]', '', value).strip().lower())\n value = unicode(re.sub('[-\\s]+', '-', value))\n\nThere's more, but I left it out, since it doesn't address slugification, but escaping.\n"} {"input": "How do I set the figure title and axes labels font size in Matplotlib? I am creating a figure in Matplotlib like this:\nfrom matplotlib import pyplot as plt\n\nfig = plt.figure()\nplt.plot(data)\nfig.suptitle('test title')\nplt.xlabel('xlabel')\nplt.ylabel('ylabel')\nfig.savefig('test.jpg')\n\nI want to specify font sizes for the figure title and the axis labels. I need all three to be different font sizes, so setting a global font size (mpl.rcParams['font.size']=x) is not what I want. How do I set font sizes for the figure title and the axis labels individually?\n", "output": "Functions dealing with text like label, title, etc. accept parameters same as matplotlib.text.Text. 
For the font size you can use size/fontsize:\nfrom matplotlib import pyplot as plt \n\nfig = plt.figure()\nplt.plot(data)\nfig.suptitle('test title', fontsize=20)\nplt.xlabel('xlabel', fontsize=18)\nplt.ylabel('ylabel', fontsize=16)\nfig.savefig('test.jpg')\n\nFor globally setting title and label sizes, mpl.rcParams contains axes.titlesize and axes.labelsize. (From the page):\naxes.titlesize : large # fontsize of the axes title\naxes.labelsize : medium # fontsize of the x any y labels\n\n(As far as I can see, there is no way to set x and y label sizes separately.)\nAnd I see that axes.titlesize does not affect suptitle. I guess, you need to set that manually.\n"} {"input": "Finding what methods an object has Given a Python object of any kind, is there an easy way to get a list of all methods that this object has?\nOr,\nif this is not possible, is there at least an easy way to check if it has a particular method other than simply checking if an error occurs when the method is called?\n", "output": "It appears you can use this code, replacing 'object' with the object you're interested in:-\n[method for method in dir(object) if callable(getattr(object, method))]\n\nI discovered it at this site, hopefully that should provide some further detail!\n"} {"input": "How do I avoid the \"self.x = x; self.y = y; self.z = z\" pattern in __init__? I see patterns like\ndef __init__(self, x, y, z):\n ...\n self.x = x\n self.y = y\n self.z = z\n ...\n\nquite frequently, often with a lot more parameters. Is there a good way to avoid this type of tedious repetitiveness? Should I inherit from namedtuple?\n", "output": "EDIT\nIt seems that several people are concerned about presenting this solution, so I will provide a very clear disclaimer. You should not use this solution. I only provide it as information, so you know that the language is capable of this. 
The rest of the answer is just showing language capabilities, not endorsing using them in this way.\nORIGINAL ANSWER\nThere isn't really anything wrong with explicitly copying parameters into attributes. If you have too many parameters in the ctor, it is sometimes considered a code smell and maybe you should group these params into a fewer objects. Other times, it is necessary and there is nothing wrong with it. Anyway, doing it explicitly is the way to go.\nHowever, since you are asking HOW it can be done (and not whether it should be done), then one solution is this:\nclass A:\n def __init__(self, **kwargs):\n for key in kwargs:\n setattr(self, key, kwargs[key])\n\na = A(l=1, d=2)\na.l # will return 1\na.d # will return 2\n\n"} {"input": "Is it a good practice to use try-except-else in Python? From time to time in Python, I see the block:\ntry:\n try_this(whatever)\nexcept SomeException as exception:\n #Handle exception\nelse:\n return something\n\nWhat is the reason for the try-except-else to exist?\nI do not like that kind of programming, as it is using exceptions to perform flow control. However, if it is included in the language, there must be a good reason for it, isn't it?\nIt is my understanding that exceptions are not errors, and that they should only be used for exceptional conditions (e.g. I try to write a file into disk and there is no more space, or maybe I do not have permission), and not for flow control. 
\nNormally I handle exceptions as:\nsomething = some_default_value\ntry:\n something = try_this(whatever)\nexcept SomeException as exception:\n #Handle exception\nfinally:\n return something\n\nOr if I really do not want to return anything if an exception happens, then:\ntry:\n something = try_this(whatever)\n return something\nexcept SomeException as exception:\n #Handle exception\n\n", "output": "\n\"I do not know if it is out of ignorance, but i do not like that \n kind of programming, as it is using exceptions to perform flow control.\"\n\nIn the Python world, using exceptions for flow control is common and normal.\nEven the Python core developers use exceptions for flow-control and that style is heavily baked into the language (i.e. the iterator protocol uses StopIteration to signal loop termination).\nIn addition, the try-except-style is used to prevent the race-conditions inherent in some of the \"look-before-you-leap\" constructs. For example, testing os.path.exists results in information that may be out-of-date by the time you use it. Likewise, Queue.full returns information that may be stale. The try-except-else style will produce more reliable code in these cases.\n\n\"It is my understanding that exceptions are not errors, they should only\n be used for exceptional conditions\"\n\nIn some other languages, that rule reflects their cultural norms as reflected in their libraries. The \"rule\" is also based in-part on performance considerations for those languages.\nThe Python cultural norm is somewhat different. In many cases, you must use exceptions for control-flow. Also, the use of exceptions in Python does not slow the surrounding code and calling code as it does in some compiled languages (i.e. 
CPython already implements code for exception checking at every step, regardless of whether you actually use exceptions or not).\nIn other words, your understanding that \"exceptions are for the exceptional\" is a rule that makes sense in some other languages, but not for Python.\n\n\"However, if it is included in the language itself, there must be a\n good reason for it, isn't it?\"\n\nBesides helping to avoid race-conditions, exceptions are also very useful for pulling error-handling outside loops. This is a necessary optimization in interpreted languages which do not tend to have automatic loop invariant code motion.\nAlso, exceptions can simplify code quite a bit in common situations where the ability to handle an issue is far removed from where the issue arose. For example, it is common to have top level user-interface code calling code for business logic which in turn calls low-level routines. Situations arising in the low-level routines (such as duplicate records for unique keys in database accesses) can only be handled in top-level code (such as asking the user for a new key that doesn't conflict with existing keys). The use of exceptions for this kind of control-flow allows the mid-level routines to completely ignore the issue and be nicely decoupled from that aspect of flow-control.\nThere is a nice blog post on the indispensability of exceptions here.\nAlso, see this Stack Overflow answer: Are exceptions really for exceptional errors?\n\n\"What is the reason for the try-except-else to exist?\"\n\nThe else-clause itself is interesting. It runs when there is no exception but before the finally-clause. That is its primary purpose. \nWithout the else-clause, the only option to run additional code before finalization would be the clumsy practice of adding the code to the try-clause. 
That is clumsy because it risks\nraising exceptions in code that wasn't intended to be protected by the try-block.\nThe use-case of running additional unprotected code prior to finalization doesn't arise very often. So, don't expect to see many examples in published code. It is somewhat rare.\nAnother use-case for the else-clause is to perform actions that must occur when no exception occurs and that do not occur when exceptions are handled. For example:\n recip = float('Inf')\n try:\n recip = 1 / f(x)\n except ZeroDivisionError:\n logging.info('Infinite result')\n else:\n logging.info('Finite result')\n\nLastly, the most common use of an else-clause in a try-block is for a bit of beautification (aligning the exceptional outcomes and non-exceptional outcomes at the same level of indentation). This use is always optional and isn't strictly necessary.\n"} {"input": "Why is there no GIL in the Java Virtual Machine? Why does Python need one so bad? I'm hoping someone can provide some insight as to what's fundamentally different about the Java Virtual Machine that allows it to implement threads nicely without the need for a Global Interpreter Lock (GIL), while Python necessitates such an evil.\n", "output": "Python (the language) doesn't need a GIL (which is why it can perfectly be implemented on JVM [Jython] and .NET [IronPython], and those implementations multithread freely). CPython (the popular implementation) has always used a GIL for ease of coding (esp. the coding of the garbage collection mechanisms) and of integration of non-thread-safe C-coded libraries (there used to be a ton of those around;-).\nThe Unladen Swallow project, among other ambitious goals, does plan a GIL-free virtual machine for Python -- to quote that site, \"In addition, we intend to remove the GIL and fix the state of multithreading in Python. 
We believe this is possible through the implementation of a more sophisticated GC system, something like IBM's Recycler (Bacon et al, 2001).\"\n"} {"input": "Python mysqldb: Library not loaded: libmysqlclient.18.dylib I just compiled and installed mysqldb for python 2.7 on my mac os 10.6. I created a simple test file that imports \nimport MySQLdb as mysql\n\nFirstly, this command is red underlined and the info tells me \"Unresolved import\". Then I tried to run the following simple python code\nimport MySQLdb as mysql\n\ndef main():\n conn = mysql.connect( charset=\"utf8\", use_unicode=True, host=\"localhost\",user=\"root\", passwd=\"\",db=\"\" )\n\nif __name__ == '__main__'():\n main()\n\nWhen executing it I get the following error message\nTraceback (most recent call last):\n File \"/path/to/project/Python/src/cvdv/TestMySQLdb.py\", line 4, in \n import MySQLdb as mysql\n File \"build/bdist.macosx-10.6-intel/egg/MySQLdb/__init__.py\", line 19, in \n \\namespace cvdv\n File \"build/bdist.macosx-10.6-intel/egg/_mysql.py\", line 7, in \n File \"build/bdist.macosx-10.6-intel/egg/_mysql.py\", line 6, in __bootstrap__\nImportError: dlopen(/Users/toom/.python-eggs/MySQL_python-1.2.3-py2.7-macosx-10.6-intel.egg-tmp/_mysql.so, 2): Library not loaded: libmysqlclient.18.dylib\n Referenced from: /Users/toom/.python-eggs/MySQL_python-1.2.3-py2.7-macosx-10.6-intel.egg-tmp/_mysql.so\n Reason: image not found\n\nWhat might be the solution to my problem?\nEDIT:\nActually I found out that the library lies in /usr/local/mysql/lib. So I need to tell my pydev eclipse version where to find it. Where do I set this?\n", "output": "I solved the problem by creating a symbolic link to the library. 
I.e.\nThe actual library resides in\n/usr/local/mysql/lib\n\nAnd then I created a symbolic link in \n/usr/lib\n\nUsing the command:\nsudo ln -s /usr/local/mysql/lib/libmysqlclient.18.dylib /usr/lib/libmysqlclient.18.dylib\n\nso that I have the following mapping:\nls -l libmysqlclient.18.dylib \nlrwxr-xr-x 1 root wheel 44 16 Jul 14:01 libmysqlclient.18.dylib -> /usr/local/mysql/lib/libmysqlclient.18.dylib\n\nThat was it. After that everything worked fine.\nEDIT:\nNotice, to do this in El Capitan mac os release, you need disable SIP. Here is hint how do it in final release: http://apple.stackexchange.com/a/208481/90910 .\n"} {"input": "How to drop rows of Pandas dataframe whose value of certain column is NaN I have a df :\n>>> df\n STK_ID EPS cash\nSTK_ID RPT_Date \n601166 20111231 601166 NaN NaN\n600036 20111231 600036 NaN 12\n600016 20111231 600016 4.3 NaN\n601009 20111231 601009 NaN NaN\n601939 20111231 601939 2.5 NaN\n000001 20111231 000001 NaN NaN\n\nThen I just want the records whose EPS is not NaN, that is, df.drop(....) will return the dataframe as below:\n STK_ID EPS cash\nSTK_ID RPT_Date \n600016 20111231 600016 4.3 NaN\n601939 20111231 601939 2.5 NaN\n\nHow to do that ?\n", "output": "This question is already resolved, but... \n...also consider the solution suggested by Wouter in his original comment. The ability to handle missing data, including dropna(), is built into pandas explicitly. Aside from potentially improved performance over doing it manually, these functions also come with a variety of options which may be useful. 
\nIn [24]: df = pd.DataFrame(np.random.randn(10,3))\n\nIn [25]: df.ix[::2,0] = np.nan; df.ix[::4,1] = np.nan; df.ix[::3,2] = np.nan;\n\nIn [26]: df\nOut[26]:\n 0 1 2\n0 NaN NaN NaN\n1 2.677677 -1.466923 -0.750366\n2 NaN 0.798002 -0.906038\n3 0.672201 0.964789 NaN\n4 NaN NaN 0.050742\n5 -1.250970 0.030561 -2.678622\n6 NaN 1.036043 NaN\n7 0.049896 -0.308003 0.823295\n8 NaN NaN 0.637482\n9 -0.310130 0.078891 NaN\n\n\nIn [27]: df.dropna() #drop all rows that have any NaN values\nOut[27]:\n 0 1 2\n1 2.677677 -1.466923 -0.750366\n5 -1.250970 0.030561 -2.678622\n7 0.049896 -0.308003 0.823295\n\n\nIn [28]: df.dropna(how='all') #drop only if ALL columns are NaN\nOut[28]:\n 0 1 2\n1 2.677677 -1.466923 -0.750366\n2 NaN 0.798002 -0.906038\n3 0.672201 0.964789 NaN\n4 NaN NaN 0.050742\n5 -1.250970 0.030561 -2.678622\n6 NaN 1.036043 NaN\n7 0.049896 -0.308003 0.823295\n8 NaN NaN 0.637482\n9 -0.310130 0.078891 NaN\n\n\nIn [29]: df.dropna(thresh=2) #Drop row if it does not have at least two values that are **not** NaN\nOut[29]:\n 0 1 2\n1 2.677677 -1.466923 -0.750366\n2 NaN 0.798002 -0.906038\n3 0.672201 0.964789 NaN\n5 -1.250970 0.030561 -2.678622\n7 0.049896 -0.308003 0.823295\n9 -0.310130 0.078891 NaN\n\n\nIn [30]: df.dropna(subset=[1]) #Drop only if NaN in specific column (as asked in the question)\nOut[30]:\n 0 1 2\n1 2.677677 -1.466923 -0.750366\n2 NaN 0.798002 -0.906038\n3 0.672201 0.964789 NaN\n5 -1.250970 0.030561 -2.678622\n6 NaN 1.036043 NaN\n7 0.049896 -0.308003 0.823295\n9 -0.310130 0.078891 NaN\n\nThere are also other options (See docs at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html), including dropping columns instead of rows. \nPretty handy! \n"} {"input": "How do I get user IP address in django? 
How do I get user's IP in django?\nI have a view like this:\n# Create your views\nfrom django.contrib.gis.utils import GeoIP\nfrom django.template import RequestContext\nfrom django.shortcuts import render_to_response\n\n\ndef home(request):\n g = GeoIP()\n client_ip = request.META['REMOTE_ADDR']\n lat,long = g.lat_lon(client_ip)\n return render_to_response('home_page_tmp.html',locals())\n\nBut I get this error:\nKeyError at /mypage/\n 'REMOTE_ADDR'\n Request Method: GET\n Request URL: http://mywebsite.com/mypage/\n Django Version: 1.2.4\n Exception Type: KeyError\n Exception Value: \n 'REMOTE_ADDR'\n Exception Location: /mysite/homepage/views.py in home, line 9\n Python Executable: /usr/bin/python\n Python Version: 2.6.6\n Python Path: ['/mysite', '/usr/local/lib/python2.6/dist-packages/flup-1.0.2-py2.6.egg', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2', '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old', '/usr/lib/python2.6/lib-dynload', '/usr/local/lib/python2.6/dist-packages', '/usr/lib/python2.6/dist-packages', '/usr/lib/pymodules/python2.6']\n Server time: Sun, 2 Jan 2011 20:42:50 -0600\n\n", "output": "def get_client_ip(request):\n x_forwarded_for = request.META.get('HTTP_X_FORWARDED_FOR')\n if x_forwarded_for:\n ip = x_forwarded_for.split(',')[0]\n else:\n ip = request.META.get('REMOTE_ADDR')\n return ip\n\nMake sure you have reverse proxy (if any) configured correctly (e.g. mod_rpaf installed for Apache).\nNote: the above uses the first item in X-Forwarded-For, but you might want to use the last item (e.g., in the case of Heroku: Get client's real IP address on Heroku)\nget_client_ip = lambda req: req.META.get('HTTP_X_FORWARDED_FOR') if req.META.get('HTTP_X_FORWARDED_FOR') else ip = request.META.get('REMOTE_ADDR')\n\nAnd then just pass the request as argument to it;\nget_client_ip(request)\n\n"} {"input": "Python, Matplotlib, subplot: How to set the axis range? How can I set the y axis range of the second subplot to e.g. 
[0,1000] ?\nThe FFT plot of my data (a column in a text file) results in a (inf.?) spike so that the actual data is not visible.\npylab.ylim([0,1000])\n\nhas no effect, unfortunately. This is the whole script:\n# based on http://www.swharden.com/blog/2009-01-21-signal-filtering-with-python/\nimport numpy, scipy, pylab, random\n\nxs = []\nrawsignal = []\nwith open(\"test.dat\", 'r') as f:\n for line in f:\n if line[0] != '#' and len(line) > 0:\n xs.append( int( line.split()[0] ) )\n rawsignal.append( int( line.split()[1] ) )\n\nh, w = 3, 1\npylab.figure(figsize=(12,9))\npylab.subplots_adjust(hspace=.7)\n\npylab.subplot(h,w,1)\npylab.title(\"Signal\")\npylab.plot(xs,rawsignal)\n\npylab.subplot(h,w,2)\npylab.title(\"FFT\")\nfft = scipy.fft(rawsignal)\n#~ pylab.axis([None,None,0,1000])\npylab.ylim([0,1000])\npylab.plot(abs(fft))\n\npylab.savefig(\"SIG.png\",dpi=200)\npylab.show()\n\nOther improvements are also appreciated!\n", "output": "As found in http://www.mofeel.net/582-comp-soft-sys-matlab/54166.aspx\n pylab.ylim([0,1000])\n\nNote: The command has to be executed after the plot!\n"} {"input": "What can you use Python generator functions for? I'm starting to learn Python and I've come across generator functions, those that have a yield statement in them. I want to know what types of problems that these functions are really good at solving.\n", "output": "Generators give you lazy evaluation. You use them by iterating over them, either explicitly with 'for' or implicitly by passing it to any function or construct that iterates. 
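Both consumption styles -- the explicit 'for' and the implicit pass-to-a-consumer -- can be sketched in a few lines. The squares generator below is an invented example, not from the original answer:

```python
def squares(n):
    # a generator: values are produced one at a time, on demand
    for i in range(n):
        yield i * i

# explicit iteration with 'for'
for value in squares(3):
    print(value)            # prints 0, then 1, then 4

# implicit iteration: any function that consumes an iterable works
print(sum(squares(4)))      # prints 14  (0 + 1 + 4 + 9)
print(list(squares(3)))     # prints [0, 1, 4]
```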
You can think of generators as returning multiple items, as if they return a list, but instead of returning them all at once they return them one-by-one, and the generator function is paused until the next item is requested.\nGenerators are good for calculating large sets of results (in particular calculations involving loops themselves) where you don't know if you are going to need all results, or where you don't want to allocate the memory for all results at the same time. Or for situations where the generator uses another generator, or consumes some other resource, and it's more convenient if that happened as late as possible.\nAnother use for generators (that is really the same) is to replace callbacks with iteration. In some situations you want a function to do a lot of work and occasionally report back to the caller. Traditionally you'd use a callback function for this. You pass this callback to the work-function and it would periodically call this callback. The generator approach is that the work-function (now a generator) knows nothing about the callback, and merely yields whenever it wants to report something. The caller, instead of writing a separate callback and passing that to the work-function, does all the reporting work in a little 'for' loop around the generator.\nFor example, say you wrote a 'filesystem search' program. You could perform the search in its entirety, collect the results and then display them one at a time. All of the results would have to be collected before you showed the first, and all of the results would be in memory at the same time. Or you could display the results while you find them, which would be more memory efficient and much friendlier towards the user. 
The latter could be done by passing the result-printing function to the filesystem-search function, or it could be done by just making the search function a generator and iterating over the result.\nIf you want to see an example of the latter two approaches, see os.path.walk() (the old filesystem-walking function with callback) and os.walk() (the new filesystem-walking generator.) Of course, if you really wanted to collect all results in a list, the generator approach is trivial to convert to the big-list approach:\nbig_list = list(the_generator)\n\n"} {"input": "ImportError: Cannot import name X I have four different files named: main, vector, entity, and physics. I will not post all the code, mostly just the imports, because I think that's where the error is. But if you want, I can post more.\nMain:\nimport time\nfrom entity import Ent\nfrom vector import Vect\n#the rest just creates an entity and prints the result of movement\n\nEntity:\nfrom vector import Vect\nfrom physics import Physics\nclass Ent:\n #holds vector information and id\ndef tick(self, dt):\n #this is where physics changes the velocity and position vectors\n\nVector:\nfrom math import *\nclass Vect:\n #holds i, j, k, and does vector math\n\nPhysics:\nfrom entity import Ent\nclass Physics:\n #physics class gets an entity and does physics calculations on it.\n\nI then run from main.py and I get the following error:\nTraceback (most recent call last):\nFile \"main.py\", line 2, in \n from entity import Ent\nFile \".../entity.py\", line 5, in \n from physics import Physics\nFile \".../physics.py\", line 2, in \n from entity import Ent\nImportError: cannot import name Ent\n\nI am very new to python, but have worked with C++ for a long time. I am guessing that the cause is from importing entity twice, once in main, and later in physics, but I don't know a workaround. Anyone help?\n", "output": "You have circular dependent imports. 
physics.py is imported from entity.py before class Ent is defined, and physics then tries to import entity, which is still initializing. Remove the dependency on physics from the entity module.\n"} {"input": "Python read a single character from the user Is there a way of reading one single character from the user input? For instance, they press one key at the terminal and it is returned (sort of like getch()). I know there's a function in Windows for it, but I'd like something that is cross-platform.\n", "output": "Here's a link to a site that says how you can read a single character in Windows, Linux and OSX: http://code.activestate.com/recipes/134892/\nclass _Getch:\n \"\"\"Gets a single character from standard input. Does not echo to the\nscreen.\"\"\"\n def __init__(self):\n try:\n self.impl = _GetchWindows()\n except ImportError:\n self.impl = _GetchUnix()\n\n def __call__(self): return self.impl()\n\n\nclass _GetchUnix:\n def __init__(self):\n import tty, sys\n\n def __call__(self):\n import sys, tty, termios\n fd = sys.stdin.fileno()\n old_settings = termios.tcgetattr(fd)\n try:\n tty.setraw(sys.stdin.fileno())\n ch = sys.stdin.read(1)\n finally:\n termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)\n return ch\n\n\nclass _GetchWindows:\n def __init__(self):\n import msvcrt\n\n def __call__(self):\n import msvcrt\n return msvcrt.getch()\n\n\ngetch = _Getch()\n\n"} {"input": "How do I create a slug in Django? I am trying to create a SlugField in Django.\nI created this simple model:\nfrom django.db import models\n\nclass Test(models.Model):\n q = models.CharField(max_length=30)\n s = models.SlugField()\n\nI then do this:\n>>> from mysite.books.models import Test\n>>> t=Test(q=\"aa a a a\", s=\"b b b b\")\n>>> t.s\n'b b b b'\n>>> t.save()\n>>> t.s\n'b b b b'\n>>> \n\nI was expecting b-b-b-b\n", "output": "You will need to use the slugify function.
\n>>> from django.template.defaultfilters import slugify\n>>> slugify(\"b b b b\")\nu'b-b-b-b'\n>>>\n\nYou can call slugify automatically by overriding the save method:\nclass test(models.Model):\n q = models.CharField(max_length=30)\n s = models.SlugField()\n\n def save(self, *args, **kwargs):\n self.s = slugify(self.q)\n super(test, self).save(*args, **kwargs)\n\nBe aware that the above will cause your URL to change when the q field is edited, which can cause broken links. It may be preferable to generate the slug only once when you create a new object:\nclass test(models.Model):\n q = models.CharField(max_length=30)\n s = models.SlugField()\n\n def save(self, *args, **kwargs):\n if not self.id:\n # Newly created object, so set slug\n self.s = slugify(self.q)\n\n super(test, self).save(*args, **kwargs)\n\n"} {"input": "Why aren't python nested functions called closures? I have seen and used nested functions in Python, and they match the definition of a closure. So why are they called nested functions instead of closures?\nAre nested functions not closures because they are not used by the external world?\nUPDATE: I was reading about closures and it got me thinking about this concept with respect to Python. I searched and found the article mentioned by someone in a comment below, but I couldn't completely understand the explanation in that article, so that is why I am asking this question.\n", "output": "A closure occurs when a function has access to a local variable from an enclosing scope that has finished its execution.\ndef make_printer(msg):\n def printer():\n print msg\n return printer\n\nprinter = make_printer('Foo!')\nprinter()\n\nWhen make_printer is called, a new frame is put on the stack with the compiled code for the printer function as a constant and the value of msg as a local. It then creates and returns the function. Because the function printer references the msg variable, it is kept alive after the make_printer function has returned. 
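One way to see that msg really is kept alive (my own sketch, not part of the original answer) is to inspect the returned function's __closure__ attribute:

```python
def make_printer(msg):
    def printer():
        print(msg)
    return printer

printer = make_printer('Foo!')
# make_printer has already returned, yet msg survives in a closure cell:
print(printer.__closure__[0].cell_contents)  # -> Foo!
```

A plain (non-closure) nested function would have __closure__ set to None instead.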
\nSo, if your nested functions don't \n\naccess variables that are local to enclosing scopes, \ndo so when they are executed outside of that scope,\n\nthen they are not closures.\nHere's an example of a nested function which is not a closure.\ndef make_printer(msg):\n def printer(msg=msg):\n print msg\n return printer\n\nprinter = make_printer(\"Foo!\")\nprinter() #Output: Foo!\n\nHere, we are binding the value to the default value of a parameter. This occurs when the function printer is created and so no reference to the value of msg external to printer needs to be maintained after make_printer returns. msg is just a normal local variable of the function printer in this context. \n"} {"input": "Lazy Method for Reading Big File in Python? I have a very big file 4GB and when I try to read it my computer hangs.\nSo I want to read it piece by piece and after processing each piece store the processed piece into another file and read next piece.\nIs there any method to yield these pieces ?\nI would love to have a lazy method.\n", "output": "To write a lazy function, just use yield:\ndef read_in_chunks(file_object, chunk_size=1024):\n \"\"\"Lazy function (generator) to read a file piece by piece.\n Default chunk size: 1k.\"\"\"\n while True:\n data = file_object.read(chunk_size)\n if not data:\n break\n yield data\n\n\nf = open('really_big_file.dat')\nfor piece in read_in_chunks(f):\n process_data(piece)\n\n\nAnother option would be to use iter and a helper function:\nf = open('really_big_file.dat')\ndef read1k():\n return f.read(1024)\n\nfor piece in iter(read1k, ''):\n process_data(piece)\n\n\nIf the file is line-based, the file object is already a lazy generator of lines:\nfor line in open('really_big_file.dat'):\n process_data(line)\n\n"} {"input": "check if all elements in a list are identical I need the following function:\nInput: a list\nOutput: \n\nTrue if all elements in the input list evaluate as equal to each other using the standard equality operator; \nFalse 
otherwise.\n\nPerformance: of course, I prefer not to incur any unnecessary overhead.\nI feel it would be best to: \n\niterate through the list \ncompare adjacent elements\nand AND all the resulting Boolean values\n\nBut I'm not sure what's the most Pythonic way to do that.\n\nEDIT:\nThank you for all the great answers. I rated up several, and it was really hard to choose between @KennyTM and @Ivo van der Wijk solutions.\nThe lack of short-circuit feature only hurts on a long input (over ~50 elements) that have unequal elements early on. If this occurs often enough (how often depends on how long the lists might be), the short-circuit is required. The best short-circuit algorithm seems to be @KennyTM checkEqual1. It pays, however, a significant cost for this:\n\nup to 20x in performance nearly-identical lists\nup to 2.5x in performance on short lists\n\nIf the long inputs with early unequal elements don't happen (or happen sufficiently rarely), short-circuit isn't required. Then, by far the fastest is @Ivo van der Wijk solution.\n", "output": "General method:\ndef checkEqual1(iterator):\n iterator = iter(iterator)\n try:\n first = next(iterator)\n except StopIteration:\n return True\n return all(first == rest for rest in iterator)\n\nOne-liner:\ndef checkEqual2(iterator):\n return len(set(iterator)) <= 1\n\nAlso one-liner:\ndef checkEqual3(lst):\n return lst[1:] == lst[:-1]\n\nThe difference between the 3 versions are that:\n\nIn checkEqual2 the content must be hashable.\ncheckEqual1 and checkEqual2 can use any iterators, but checkEqual3 must take a sequence input, typically concrete containers like a list or tuple.\ncheckEqual1 stops as soon as a difference is found.\nSince checkEqual1 contains more Python code, it is less efficient when many of the items are equal in the beginning.\nSince checkEqual2 and checkEqual3 always perform O(N) copying operations, they will take longer if most of your input will return False.\ncheckEqual2 and checkEqual3 can't be easily 
changed to adopt to compare a is b instead of a == b.\n\n\ntimeit result, for Python 2.7 and (only s1, s4, s7, s9 should return True)\ns1 = [1] * 5000\ns2 = [1] * 4999 + [2]\ns3 = [2] + [1]*4999\ns4 = [set([9])] * 5000\ns5 = [set([9])] * 4999 + [set([10])]\ns6 = [set([10])] + [set([9])] * 4999\ns7 = [1,1]\ns8 = [1,2]\ns9 = []\n\nwe get\n | checkEqual1 | checkEqual2 | checkEqual3 | checkEqualIvo | checkEqual6502 |\n|-----|-------------|-------------|--------------|---------------|----------------|\n| s1 | 1.19 msec | 348 usec | 183 usec | 51.6 usec | 121 usec |\n| s2 | 1.17 msec | 376 usec | 185 usec | 50.9 usec | 118 usec |\n| s3 | 4.17 usec | 348 usec | 120 usec | 264 usec | 61.3 usec |\n| | | | | | |\n| s4 | 1.73 msec | | 182 usec | 50.5 usec | 121 usec |\n| s5 | 1.71 msec | | 181 usec | 50.6 usec | 125 usec |\n| s6 | 4.29 usec | | 122 usec | 423 usec | 61.1 usec |\n| | | | | | |\n| s7 | 3.1 usec | 1.4 usec | 1.24 usec | 0.932 usec | 1.92 usec |\n| s8 | 4.07 usec | 1.54 usec | 1.28 usec | 0.997 usec | 1.79 usec |\n| s9 | 5.91 usec | 1.25 usec | 0.749 usec | 0.407 usec | 0.386 usec |\n\nNote:\n# http://stackoverflow.com/q/3844948/\ndef checkEqualIvo(lst):\n return not lst or lst.count(lst[0]) == len(lst)\n\n# http://stackoverflow.com/q/3844931/\ndef checkEqual6502(lst):\n return not lst or [lst[0]]*len(lst) == lst\n\n"} {"input": "Get MD5 hash of big files in Python I have used hashlib (which replaces md5 in Python 2.6/3.0) and it worked fine if I opened a file and put its content in hashlib.md5() function.\nThe problem is with very big files that their sizes could exceed RAM size.\nHow to get the MD5 hash of a file without loading the whole file to memory?\n", "output": "You need to read the file in chunks of suitable size:\ndef md5_for_file(f, block_size=2**20):\n md5 = hashlib.md5()\n while True:\n data = f.read(block_size)\n if not data:\n break\n md5.update(data)\n return md5.digest()\n\nNOTE: Make sure you open your file with the 'rb' to the open - otherwise 
you will get the wrong result.\nSo to do the whole lot in one method - use something like: \ndef generate_file_md5(rootdir, filename, blocksize=2**20):\n m = hashlib.md5()\n with open( os.path.join(rootdir, filename) , \"rb\" ) as f:\n while True:\n buf = f.read(blocksize)\n if not buf:\n break\n m.update( buf )\n return m.hexdigest()\n\nThe update above was based on the comments provided by Frerich Raabe - and I tested this and found it to be correct on my Python 2.7.2 windows installation\nI cross-checked the results using the 'jacksum' tool.\njacksum -a md5 \n\nhttp://www.jonelo.de/java/jacksum/\n"} {"input": "How to exit from Python without traceback? I would like to know how to I exit from Python without having an traceback dump on the output. \nI still want want to be able to return an error code but I do not want to display the traceback log.\nI want to be able to exit using exit(number) without trace but in case of an Exception (not an exit) I want the trace.\n", "output": "You are presumably encountering an exception and the program is exiting because of this (with a traceback). The first thing to do therefore is to catch that exception, before exiting cleanly (maybe with a message, example given).\nTry something like this in your main routine:\nimport sys, traceback\n\ndef main():\n try:\n do main program stuff here\n ....\n except KeyboardInterrupt:\n print \"Shutdown requested...exiting\"\n except Exception:\n traceback.print_exc(file=sys.stdout)\n sys.exit(0)\n\nif __name__ == \"__main__\":\n main()\n\n"} {"input": "How do I execute a string containing Python code in Python? 
How do I execute a string containing Python code in Python?\n", "output": "For statements, use exec(string) (Python 2/3) or exec string (Python 2):\n>>> mycode = 'print \"hello world\"'\n>>> exec(mycode)\nhello world\n\nWhen you need the value of an expression, use eval(string):\n>>> x = eval(\"2+2\")\n>>> x\n4\n\nHowever, the first step should be to ask yourself if you really need to. Executing code should generally be the position of last resort: It's slow, ugly and dangerous if it can contain user-entered code. You should always look at alternatives first, such as higher order functions, to see if these can better meet your needs.\n"} {"input": "How to comment out a block of Python code in Vim I was wondering if there was any key mapping in Vim to allow me to indent certain lines of code (whether those lines have been selected in visual mode, or n lines above/below current cursor position).\nSo basically something that converts the following \ndef my_fun(x, y):\n return x + y\n\nto\n#def my_fun(x, y):\n# return x + y\n\nI am okay with using either # or \"\"\" for commenting out the relevant lines. Ideally, I would also like the same keymapping to uncomment the lines if the given lines have been commented out.\n", "output": "Step 1: Go to the first column of the first line you want to comment.\n\nStep 2: Press: Ctrl+v and select the lines you want to comment:\n\nStep 3: Shift-I#space (Enter Insert-at-left mode, type chars to insert.
The selection will disappear, but all lines within it will be modified after Step 4.)\n\nStep 4: Esc\n\n"} {"input": "Store output of subprocess.Popen call in a string I'm trying to make a system call in Python and store the output to a string that I can manipulate in the Python program.\n#!/usr/bin/python\nimport subprocess\np2 = subprocess.Popen(\"ntpq -p\")\n\nI've tried a few things including some of the suggestions here:\nRetrieving the output of subprocess.call()\nbut without any luck.\n", "output": "In Python 2.7 or Python 3\nInstead of making a Popen object directly, you can use the subprocess.check_output() function to store output of a command in a string:\nfrom subprocess import check_output\n\nout = check_output([\"ntpq\", \"-p\"])\n\nIn Python 2.4-2.6\nUse the communicate method.\nimport subprocess\np = subprocess.Popen([\"ntpq\", \"-p\"], stdout=subprocess.PIPE)\nout, err = p.communicate()\n\nout is what you want.\nImportant note about the other answers\nNote how I passed in the command. The \"ntpq -p\" example brings up another matter. Since Popen does not invoke the shell, you would use a list of the command and options: [\"ntpq\", \"-p\"].\n"} {"input": "Getting a map() to return a list in Python 3.x I'm trying to map a list into hex, and then use the list elsewhere. In python 2.6, this was easy:\nA: Python 2.6:\n>>> map(chr, [66, 53, 0, 94])\n['B', '5', '\\x00', '^']\n\nHowever, on Python 3.1, the above returns a map object.\nB: Python 3.1:\n>>> map(chr, [66, 53, 0, 94])\n\n\nHow do I retrieve the mapped list (as in A above) on Python 3.x? \nAlternatively, is there a better way of doing this? My initial list object has around 45 items and I'd like to convert them to hex.\n", "output": "Do this:\nlist(map(chr,[66,53,0,94]))\n\nIn Python 3+, many processes that iterate over iterables return iterators themselves. In most cases, this ends up saving memory, and should make things go faster.
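A small Python 3 session sketching that iterator behaviour (illustrative, not from the answer above):

```python
m = map(chr, [66, 53, 0, 94])
print(m)        # a map object; nothing has been computed yet
print(list(m))  # ['B', '5', '\x00', '^']
print(list(m))  # []  (the iterator is exhausted after one pass)
```

Like a generator, the map object produces each value once and cannot be rewound.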
\nIf all you're going to do is iterate over this list eventually, there's no need to even convert it to a list, because you can still iterate over the map object like so:\n# Prints \"ABCD\"\nfor ch in map(chr,[65,66,67,68]):\n print(ch)\n\n"} {"input": "Pretty-printing of numpy.array I'm curious, whether there is any way to print formatted numpy.arrays, e.g., in the way similar to this:\nx = 1.23456\nprint '%.3f' % x\n\nIf I want to print the numpy.array of floats, it prints several decimals, often in 'scientific' format, which is rather hard to read even for low-dimensional arrays. However, numpy.array apparently has to be printed as a string, i.e., with %s. Is there any solution for this purpose? \n", "output": "You can use set_printoptions to set the precision of the output:\nimport numpy as np\nx=np.random.random(10)\nprint(x)\n# [ 0.07837821 0.48002108 0.41274116 0.82993414 0.77610352 0.1023732\n# 0.51303098 0.4617183 0.33487207 0.71162095]\n\nnp.set_printoptions(precision=3)\nprint(x)\n# [ 0.078 0.48 0.413 0.83 0.776 0.102 0.513 0.462 0.335 0.712]\n\nAnd suppress suppresses the use of scientific notation for small numbers:\ny=np.array([1.5e-10,1.5,1500])\nprint(y)\n# [ 1.500e-10 1.500e+00 1.500e+03]\nnp.set_printoptions(suppress=True)\nprint(y)\n# [ 0. 1.5 1500. 
]\n\nSee the docs for set_printoptions for other options.\n\nTo apply print options locally, you could use a contextmanager:\nimport numpy as np\nimport contextlib\n\n@contextlib.contextmanager\ndef printoptions(*args, **kwargs):\n original = np.get_printoptions()\n np.set_printoptions(*args, **kwargs)\n yield \n np.set_printoptions(**original)\n\nFor example, inside the with-suite precision=3 and suppress=True are set:\nx = np.random.random(10)\nwith printoptions(precision=3, suppress=True):\n print(x)\n # [ 0.073 0.461 0.689 0.754 0.624 0.901 0.049 0.582 0.557 0.348]\n\nBut outside the with-suite the print options are back to default settings:\nprint(x) \n# [ 0.07334334 0.46132615 0.68935231 0.75379645 0.62424021 0.90115836\n# 0.04879837 0.58207504 0.55694118 0.34768638]\n\n\nTo prevent zeros from being stripped from the end of floats:\nnp.set_printoptions now has a formatter parameter which allows you to specify a format function for each type.\nnp.set_printoptions(formatter={'float': '{: 0.3f}'.format})\nprint(x)\n\nwhich prints\n[ 0.078 0.480 0.413 0.830 0.776 0.102 0.513 0.462 0.335 0.712]\n\ninstead of \n[ 0.078 0.48 0.413 0.83 0.776 0.102 0.513 0.462 0.335 0.712]\n\n"} {"input": "Convert list of dictionaries to Dataframe I have a list of dictionaries like this:\n[{'points': 50, 'time': '5:00', 'year': 2010}, \n{'points': 25, 'time': '6:00', 'month': \"february\"}, \n{'points':90, 'time': '9:00', 'month': 'january'}, \n{'points_h1':20, 'month': 'june'}]\n\nand I want to turn this into a pandas DataFrame like this:\n month points points_h1 time year\n0 NaN 50 NaN 5:00 2010\n1 february 25 NaN 6:00 NaN\n2 january 90 NaN 9:00 NaN\n3 june NaN 20 NaN NaN\n\nNote: Order of the columns does not matter.\nUltimately, the goal is to write this to a text file and this seems like the best solution I could find. 
How can I turn the list of dictionaries into a pandas DataFrame as shown above?\n", "output": "Supposing d is your list of dicts, simply:\npd.DataFrame(d)\n\n"} {"input": "Having Django serve downloadable files I want users on the site to be able to download files whose paths are obscured so they cannot be directly downloaded.\nFor instance, I'd like the URL to be something like this, \"http://example.com/download/?f=somefile.txt\nAnd on the server, I know that all downloadable files reside in a folder \"/home/user/files/\".\nIs there a way to make Django serve that file for download as opposed to trying to find a URL and View to display it?\n", "output": "For the \"best of both worlds\" you could combine S.Lott's solution with the xsendfile module: Django generates the path to the file (or the file itself), but the actual file serving is handled by Apache/Lighttpd. Once you've set up mod_xsendfile, integrating with your view takes a few lines of code:\nfrom django.utils.encoding import smart_str\n\nresponse = HttpResponse(mimetype='application/force-download') # mimetype is replaced by content_type for django 1.7\nresponse['Content-Disposition'] = 'attachment; filename=%s' % smart_str(file_name)\nresponse['X-Sendfile'] = smart_str(path_to_file)\n# It's usually a good idea to set the 'Content-Length' header too.\n# You can also set any other required headers: Cache-Control, etc.\nreturn response\n\nOf course, this will only work if you have control over your server, or your hosting company has mod_xsendfile already set up.\n"} {"input": "Compiled vs. Interpreted Languages I'm trying to get a better understanding of the difference. I've found a lot of explanations online, but they tend towards the abstract differences rather than the practical implications.\nMost of my programming experience has been with CPython (dynamic, interpreted), and Java (static, compiled). However, I understand that there are other kinds of interpreted and compiled languages.
Aside from the fact that executable files can be distributed from programs written in compiled languages, are there any advantages/disadvantages to each type? Oftentimes, I hear people arguing that interpreted languages can be used interactively, but I believe that compiled languages can have interactive implementations as well, correct?\n", "output": "A compiled language is one where the program, once compiled, is expressed in the instructions of the target machine. For example, an addition \"+\" operation in your source code could be translated directly to the \"ADD\" instruction in machine code.\nAn interpreted language is one where the instructions are not directly executed by the target machine, but instead read and executed by some other program (which normally is written in the language of the native machine). For example, the same \"+\" operation would be recognised by the interpreter at run time, which would then call its own \"add(a,b)\" function with the appropriate arguments, which would then execute the machine code \"ADD\" instruction.\nYou can do anything that you can do in an interpreted language in a compiled language and vice-versa - they are both Turing complete. Both however have advantages and disadvantages for implementation and use.\nI'm going to completely generalise (purists forgive me!) but, roughly, here are the advantages of compiled languages:\n\nFaster performance by directly using the native code of the target machine\nOpportunity to apply quite powerful optimisations during the compile stage\n\nAnd here are the advantages of interpreted languages:\n\nEasier to implement (writing good compilers is very hard!!)\nNo need to run a compilation stage: can execute code directly \"on the fly\"\nCan be more convenient for dynamic languages\n\nNote that modern techniques such as bytecode compilation add some extra complexity - what happens here is that the compiler targets a \"virtual machine\" which is not the same as the underlying hardware. 
These virtual machine instructions can then be compiled again at a later stage to get native code (e.g. as done by the Java JVM JIT compiler).\n"} {"input": "Print a list in reverse order with range()? How can you produce the following list with range() in Python?\n[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]\n\n", "output": "Use the 'range' built-in function. The signature is range(start, stop, step). This produces a sequence that yields numbers, starting with start, and ending if stop has been reached, excluding stop.\n>>> range(9,-1,-1) \n [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]\n>>> range(-2, 6, 2)\n [-2, 0, 2, 4]\n\nIn Python 3, this produces a non-list range object, which functions effectively like a read-only list (but uses way less memory, particularly for large ranges).\n"} {"input": "Numpy array dimensions I'm currently trying to learn Numpy and Python. Given the following array:\nimport numpy as N\na = N.array([[1,2],[1,2]])\n\nIs there a function that returns the dimensions of a (e.g.a is a 2 by 2 array)?\nsize() returns 4 and that doesn't help very much. \n", "output": "It is .shape:\n\nndarray.shape\n Tuple of array dimensions.\n\nThus:\n>>> a.shape\n(2, 2)\n\n"} {"input": "How to retrieve an element from a set without removing it? Suppose the following:\n>>>s = set([1, 2, 3])\n\nHow do I get a value (any value) out of s without doing s.pop()? I want to leave the item in the set until I am sure I can remove it - something I can only be sure of after an asynchronous call to another host.\nQuick and dirty:\n>>>elem = s.pop()\n>>>s.add(elem)\n\nBut do you know of a better way? Ideally in constant time.\n", "output": "Two options that don't require copying the whole set:\nfor e in s:\n break\n# e is now an element from s\n\nOr...\ne = next(iter(s))\n\nBut in general, sets don't support indexing or slicing.\n"} {"input": "What is an alternative to execfile in Python 3.0? 
It seems they canceled in Python 3.0 all the easy ways to quickly load a script file - both execfile() and reload(). \nIs there an obvious alternative I'm missing?\n", "output": "You are just supposed to read the file and exec the code yourself. 2to3 currently replaces\nexecfile(\"somefile.py\", global_vars, local_vars)\n\nby\nwith open(\"somefile.py\") as f:\n code = compile(f.read(), \"somefile.py\", 'exec')\n exec(code, global_vars, local_vars)\n\n(The compile call isn't strictly needed, but it associates the filename with the code object making debugging a little easier.)\nSee:\n\nhttp://docs.python.org/release/2.7.3/library/functions.html#execfile\nhttp://docs.python.org/release/3.2.3/library/functions.html#compile\nhttp://docs.python.org/release/3.2.3/library/functions.html#exec\n\n"} {"input": "How do I integrate Ajax with Django applications? I am new to Django and pretty new to Ajax. I am working on a project where I need to integrate the two. I believe that I understand the principles behind them both, but have not found a good explanation of the two together. \nCould someone give me a quick explanation of how the codebase must change with the two of them integrating together?\nFor example, can I still use the HttpResponse with Ajax, or do my responses have to change with the use of Ajax? If so, could you please provide an example of how the responses to the requests must change? If it makes any difference, the data I am returning is JSON. \n", "output": "Even though this isn't entirely in the SO spirit, I love this question, because I had the same trouble when I started so I'll give you a quick guide. Obviously you don't understand the principles behind them (don't take it as an offense, but if you did you wouldn't be asking). \nDjango is server-side. It means that when a client goes to a url, you have a function inside views that renders what he sees and returns a response in html.
let's break it up into examples:\nviews.py\ndef hello(request):\n return HttpResponse('Hello World!')\n\ndef home(request):\n return render_to_response('index.html', {'variable': 'world'})\n\nindex.html:\n

<html><body>Hello {{ variable }}, welcome to my awesome site</body></html>

\n\nurls.py\nurl(r'^hello/', 'myapp.views.hello'),\nurl(r'^home/', 'myapp.views.home'),\n\nThat's an example of the simplest of usages. Going to 127.0.0.1:8000/hello means a request to the hello function, going to 127.0.0.1:8000/home will return the index.html and replace all the variables as asked (you probably know all this by now).\nNow let's talk about AJAX. AJAX calls are client-side code that does asynchronous requests. That sounds complicated, but it simply means it does a request for you in the background and then handles the response. So when you do an AJAX call for some url, you get the same data you would get as a user going to that place. \nFor example, an ajax call to 127.0.0.1:8000/hello will return the same thing it would as if you visited it. Only this time, you have it inside a js function and you can deal with it however you'd like. Let's look at a simple use case:\n$.ajax({\n url: '127.0.0.1:8000/hello',\n type: 'get', // This is the default though, you don't actually need to always mention it\n success: function(data) {\n alert(data);\n },\n failure: function(data) { \n alert('Got an error dude');\n }\n}); \n\nThe general process is this:\n\nThe call goes to the url 127.0.0.1:8000/hello as if you opened a new tab and did it yourself.\nIf it succeeds (status code 200), do the function for success, which will alert the data recieved.\nIf fails, do a different function.\n\nNow what would happen here? You would get an alert with 'hello world' in it. What happens if you do an ajax call to home? Same thing, you'll get an alert stating

<html><body>Hello world, welcome to my awesome site</body></html>

.\nIn other words - there's nothing new about AJAX calls. They are just a way for you to let the user get data and information without leaving the page, and it makes for a smooth and very neat design of your website. A few guidelines you should take note of:\n\nLearn jQuery. I cannot stress this enough. You're gonna have to understand it a little to know how to handle the data you receive. You'll also need to understand some basic javascript syntax (not far from python, you'll get used to it). I strongly recommend Envato's video tutorials for jQuery, they are great and will put you on the right path.\nWhen to use JSON?. You're going to see a lot of examples where the data sent by the Django views is in JSON. I didn't go into detail on that, because it isn't important how to do it (there are plenty of explanations abound) and a lot more important when. And the answer to that is - JSON data is serialized data. That is, data you can manipulate. Like I mentioned, an AJAX call will fetch the response as if the user did it himself. Now say you don't want to mess with all the html, and instead want to send data (a list of objects perhaps). JSON is good for this, because it sends it as an object (JSON data looks like a python dictionary), and then you can iterate over it or do something else that removes the need to sift through useless html.\nAdd it last. When you build a web app and want to implement AJAX - do yourself a favor. First, build the entire app completely devoid of any AJAX. See that everything is working. Then, and only then, start writing the AJAX calls. That's a good process that helps you learn a lot as well.\nUse chrome's developer tools. Since AJAX calls are done in the background it's sometimes very hard to debug them. You should use the chrome developer tools (or similar tools such as firebug) and console.log things to debug. I won't explain in detail, just google around and find out about it. It would be very helpful to you.\nCSRF awareness. 
Finally, remember that post requests in Django require the csrf_token. With AJAX calls, a lot of times you'd like to send data without refreshing the page. You'll probably face some trouble before you'd finally remember that - wait, you forgot to send the csrf_token. This is a known beginner roadblock in AJAX-Django integration, but after you learn how to make it play nice, it's easy as pie.\n\nThat's everything that comes to my head. It's a vast subject, but yeah, there's probably not enough examples out there. Just work your way there, slowly, you'll get it eventually.\n"} {"input": "Generating a PNG with matplotlib when DISPLAY is undefined I am trying to use networkx with Python. When I run this program it get this error. Is there anything missing?\n#!/usr/bin/env python\n\nimport networkx as nx\nimport matplotlib\nimport matplotlib.pyplot\nimport matplotlib.pyplot as plt\n\nG=nx.Graph()\nG.add_node(1)\nG.add_nodes_from([2,3,4,5,6,7,8,9,10])\n#nx.draw_graphviz(G)\n#nx_write_dot(G, 'node.png')\nnx.draw(G)\nplt.savefig(\"/var/www/node.png\")\n\n\nTraceback (most recent call last):\n File \"graph.py\", line 13, in \n nx.draw(G)\n File \"/usr/lib/pymodules/python2.5/networkx/drawing/nx_pylab.py\", line 124, in draw\n cf=pylab.gcf()\n File \"/usr/lib/pymodules/python2.5/matplotlib/pyplot.py\", line 276, in gcf\n return figure()\n File \"/usr/lib/pymodules/python2.5/matplotlib/pyplot.py\", line 254, in figure\n **kwargs)\n File \"/usr/lib/pymodules/python2.5/matplotlib/backends/backend_tkagg.py\", line 90, in new_figure_manager\n window = Tk.Tk()\n File \"/usr/lib/python2.5/lib-tk/Tkinter.py\", line 1650, in __init__\n self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)\n_tkinter.TclError: no display name and no $DISPLAY environment variable\n\n\nI get a different error now:\n#!/usr/bin/env python\n\nimport networkx as nx\nimport matplotlib\nimport matplotlib.pyplot\nimport matplotlib.pyplot as 
plt\n\nmatplotlib.use('Agg')\n\nG=nx.Graph()\nG.add_node(1)\nG.add_nodes_from([2,3,4,5,6,7,8,9,10])\n#nx.draw_graphviz(G)\n#nx_write_dot(G, 'node.png')\nnx.draw(G)\nplt.savefig(\"/var/www/node.png\")\n\n\n/usr/lib/pymodules/python2.5/matplotlib/__init__.py:835: UserWarning: This call to matplotlib.use() has no effect\nbecause the the backend has already been chosen;\nmatplotlib.use() must be called *before* pylab, matplotlib.pyplot,\nor matplotlib.backends is imported for the first time.\n\n if warn: warnings.warn(_use_error_msg)\nTraceback (most recent call last):\n File \"graph.py\", line 15, in \n nx.draw(G)\n File \"/usr/lib/python2.5/site-packages/networkx-1.2.dev-py2.5.egg/networkx/drawing/nx_pylab.py\", line 124, in draw\n cf=pylab.gcf()\n File \"/usr/lib/pymodules/python2.5/matplotlib/pyplot.py\", line 276, in gcf\n return figure()\n File \"/usr/lib/pymodules/python2.5/matplotlib/pyplot.py\", line 254, in figure\n **kwargs)\n File \"/usr/lib/pymodules/python2.5/matplotlib/backends/backend_tkagg.py\", line 90, in new_figure_manager\n window = Tk.Tk()\n File \"/usr/lib/python2.5/lib-tk/Tkinter.py\", line 1650, in __init__\n self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)\n_tkinter.TclError: no display name and no $DISPLAY environment variable\n\n
", "output": "The main problem is that (on your system) matplotlib chooses an X-using backend by default. I just had the same problem on one of my servers. The solution for me was to add the following code in a place that gets read before any other pylab/matplotlib/pyplot import:\nimport matplotlib\n# Force matplotlib to not use any Xwindows backend.\nmatplotlib.use('Agg')\n\nThe alternative is to set it in your .matplotlibrc\n"} {"input": "parsing boolean values with argparse I would like to use argparse to parse boolean command-line arguments written as \"--foo True\" or \"--foo False\". For example:\nmy_program --my_boolean_flag False\n\nHowever, the following test code does not do what I would like:\nimport argparse\nparser = argparse.ArgumentParser(description=\"My parser\")\nparser.add_argument(\"--my_bool\", type=bool)\ncmd_line = [\"--my_bool\", \"False\"]\nparsed_args = parser.parse_args(cmd_line)\n\nSadly, parsed_args.my_bool evaluates to True. 
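What bool() actually does with strings is easy to check: it returns False only for the empty string, so any non-empty argument, including the word "False", is truthy. A quick sanity check:

```python
# bool() on a string only tests emptiness, never the string's meaning,
# which is why argparse's type=bool turns the text "False" into True.
print(bool("False"))  # True: non-empty string
print(bool("false"))  # True: content is irrelevant
print(bool(""))       # False: only the empty string is falsy
```

This is why a custom converter function (or a pair of store_true/store_false flags) is needed instead of type=bool.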
This is the case even when I change cmd_line to be [\"--my_bool\", \"\"], which is surprising, since bool(\"\") evaluates to False.\nHow can I get argparse to parse \"False\", \"F\", and their lower-case variants to be False?\n", "output": "I think a more canonical way to do this is via:\ncommand --feature\n\nand\ncommand --no-feature\n\nargparse supports this version nicely:\nparser.add_argument('--feature', dest='feature', action='store_true')\nparser.add_argument('--no-feature', dest='feature', action='store_false')\nparser.set_defaults(feature=True)\n\n\nOf course, if you really want the --arg version, you could pass ast.literal_eval as the \"type\", or a user defined function ...\ndef t_or_f(arg):\n ua = str(arg).upper()\n if 'TRUE'.startswith(ua):\n return True\n elif 'FALSE'.startswith(ua):\n return False\n else:\n pass #error condition maybe?\n\n"} {"input": "How to percent-encode URL parameters in Python? If I do \nurl = \"http://example.com?p=\" + urllib.quote(query)\n\n\nIt doesn't encode / to %2F (breaks OAuth normalization)\nIt doesn't handle Unicode (it throws an exception)\n\nIs there a better library?\n", "output": "From the docs:\nurllib.quote(string[, safe])\n\n\nReplace special characters in string\n using the %xx escape. Letters, digits,\n and the characters '_.-' are never\n quoted. By default, this function is\n intended for quoting the path section\n of the URL. The optional safe parameter\n specifies additional characters that\n should not be quoted \u2014 its default\n value is '/'\n\nThat means passing '' for safe will solve your first issue:\n>>> urllib.quote('/test')\n'/test'\n>>> urllib.quote('/test', safe='')\n'%2Ftest'\n\nAbout the second issue, there is a bug report about it here. Apparently it was fixed in Python 3. 
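For reference, the Python 3 descendant of this function is urllib.parse.quote, which addresses both complaints: passing safe='' forces slashes to be encoded, and str input is encoded as UTF-8 automatically. A small sketch:

```python
from urllib.parse import quote

# safe defaults to '/', matching the Python 2 behaviour described above.
print(quote('/test'))           # /test
# An empty safe set percent-encodes the slash too (OAuth-style).
print(quote('/test', safe=''))  # %2Ftest
# Unicode text is UTF-8 encoded before quoting; no manual .encode() needed.
print(quote('Müller'))          # M%C3%BCller
```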
You can work around it by encoding as utf8 like this:\n>>> query = urllib.quote(u\"M\u00fcller\".encode('utf8'))\n>>> print urllib.unquote(query).decode('utf8')\nM\u00fcller\n\nBy the way, have a look at urlencode\n"} {"input": "How to add to the pythonpath in windows 7? I have a directory which hosts all of my Django apps (C:\\My_Projects). I want to add this directory to my pythonpath so I can call the apps directly.\nI have tried adding C:\\My_Projects\\; to my Path variable from the Windows GUI (My Computer > Properties > Advanced System Settings > Environment Variables). But it still doesn't read the coltrane module and generates this error:\n\nError: No module named coltrane\n\n", "output": "You know what has worked for me really well on Windows.\nMy Computer > Properties > Advanced System Settings > Environment Variables > \nThen under system variables I create a new Variable called PythonPath. In this variable I have C:\\Python27\\Lib;C:\\Python27\\DLLs;C:\\Python27\\Lib\\lib-tk;C:\\other-folder-on-the-path\n\nThis is the best way that has worked for me, which I hadn't found in any of the docs offered.\nEDIT: For those who are not able to get it,\nPlease add \n\nC:\\Python27;\n\nalong with it. Else it will never work.\n"} {"input": "Python argparse: How to insert newline in the help text? I'm using argparse in Python 2.7 for parsing input options. One of my options is a multiple choice. I want to make a list in its help text, e.g.\nfrom argparse import ArgumentParser\n\nparser = ArgumentParser(description='test')\n\nparser.add_argument('-g', choices=['a', 'b', 'g', 'd', 'e'], default='a',\n help=\"Some option, where\\n\"\n \" a = alpha\\n\"\n \" b = beta\\n\"\n \" g = gamma\\n\"\n \" d = delta\\n\"\n \" e = epsilon\")\n\nparser.parse_args()\n\nHowever, argparse strips all newlines and consecutive spaces. 
The result looks like\n\n~/Downloads:52$ python2.7 x.py -h\nusage: x.py [-h] [-g {a,b,g,d,e}]\n\ntest\n\noptional arguments:\n -h, --help show this help message and exit\n -g {a,b,g,d,e} Some option, where a = alpha b = beta g = gamma d = delta e\n = epsilon\n\nHow to insert newlines in the help text?\n", "output": "Try using RawTextHelpFormatter:\nfrom argparse import RawTextHelpFormatter\nparser = ArgumentParser(description='test', formatter_class=RawTextHelpFormatter)\n\n"} {"input": "What is the ellipsis [...] in a Python list? I was playing around in Python. I used the following code in IDLE\np = [1, 2]\np[1:1] = [p]\nprint p\n\nThe output was \n[1, [...], 2]\n\nWhat is this [...]? Interestingly I could now use this as a list of list of list up to infinity, i.e.\np[1][1][1]....\n\nI could write the above as long as I wanted and it would still work.\nEDIT:\n\nHow is it represented in memory?\nWhat's its use? Examples of some cases where it is useful would be helpful.\nAny link to official documentation would be really useful.\n\n", "output": "This is what your code created\n\nIt's a list where the first and last elements are pointing to two numbers (1 and 2) and where the middle element is pointing to the list itself.\nIn Common Lisp, when printing of circular structures is enabled, such an object would be printed as\n#1=#(1 #1# 2)\n\nmeaning that there is an object (labelled 1 with #1=) that is a vector with three elements, the second being the object itself (back-referenced with #1#).\nIn Python instead you just get the information that the structure is circular with [...].\nIn this specific case the description is not ambiguous (it's backward pointing to a list but there is only one list so it must be that one). In other cases it may however be ambiguous... 
for example in\n[1, [2, [...], 3]]\n\nthe backward reference could point either to the outer or to the inner list.\nThese two different structures printed in the same way can be created with\nx = [1, [2, 3]]\nx[1][1:1] = [x[1]]\n\ny = [1, [2, 3]]\ny[1][1:1] = [y]\n\nprint x, y\n\nand they would be in memory as\n\n"} {"input": "Creating a new dict in Python I want to build a dictionary in Python. However, all the examples that I see are instantiating a dictionary from a list, etc.\nHow do I create a new empty dictionary in Python?\n", "output": "Call dict with no parameters\nnew_dict = dict()\n\nor simply write\nnew_dict = {}\n\n"} {"input": "How do I filter ForeignKey choices in a Django ModelForm? Say I have the following in my models.py:\nclass Company(models.Model):\n name = ...\n\nclass Rate(models.Model):\n company = models.ForeignKey(Company)\n name = ...\n\nclass Client(models.Model):\n name = ...\n company = models.ForeignKey(Company)\n base_rate = models.ForeignKey(Rate)\n\nI.e. there are multiple Companies, each having a range of Rates and Clients. Each Client should have a base Rate that is chosen from its parent Company's Rates, not another Company's Rates.\nWhen creating a form for adding a Client, I would like to remove the Company choices (as that has already been selected via an \"Add Client\" button on the Company page) and limit the Rate choices to that Company as well.\nHow do I go about this in Django 1.0? 
\nMy current forms.py file is just boilerplate at the moment:\nfrom models import *\nfrom django.forms import ModelForm\n\nclass ClientForm(ModelForm):\n class Meta:\n model = Client\n\nAnd the views.py is also basic:\nfrom django.shortcuts import render_to_response, get_object_or_404\nfrom models import *\nfrom forms import *\n\ndef addclient(request, company_id):\n the_company = get_object_or_404(Company, id=company_id)\n\n if request.POST:\n \tform = ClientForm(request.POST)\n \tif form.is_valid():\n \t\tform.save()\n \t\treturn HttpResponseRedirect(the_company.get_clients_url())\n else:\n \tform = ClientForm()\n\n return render_to_response('addclient.html', {'form': form, 'the_company':the_company})\n\nIn Django 0.96 I was able to hack this in by doing something like the following before rendering the template:\nmanipulator.fields[0].choices = [(r.id,r.name) for r in Rate.objects.filter(company_id=the_company.id)]\n\nForeignKey.limit_choices_to seems promising but I don't know how to pass in the_company.id and I'm not clear if that will work outside the Admin interface anyway.\nThanks. (This seems like a pretty basic request but if I should redesign something I'm open to suggestions.)\n", "output": "ForeignKey is represented by django.forms.ModelChoiceField, which is a ChoiceField whose choices are a model QuerySet. See the reference for ModelChoiceField.\nSo, provide a QuerySet to the field's queryset attribute. Depends on how your form is built. If you build an explicit form, you'll have fields named directly. \nform.rate.queryset = Rate.objects.filter(company_id=the_company.id)\n\nIf you take the default ModelForm object, form.fields[\"rate\"].queryset = ...\nThis is done explicitly in the view. No hacking around.\n"} {"input": "How do I remove/delete a virtualenv? 
I created an environment with the following command: virtualenv venv --distribute\nI cannot remove it with the following command: rmvirtualenv venv\nI do an ls on my current directory and I still see venv\nThe only way I can remove it seems to be: sudo rm -rf venv\nNote that the environment is not active. I'm running Ubuntu 11.10. Any ideas? I've tried rebooting my system to no avail.\n", "output": "That's it! There is no command for deleting your virtual environment. Simply deactivate it and rid your application of its artifacts by recursively removing it.\n"} {"input": "How to get a complete list of object's methods and attributes? dir(re.compile(pattern)) \n\ndoes not return pattern as one of the list's elements. Namely it returns:\n['__copy__', '__deepcopy__', 'findall', 'finditer', 'match', 'scanner', 'search', 'split', 'sub', 'subn']\n\nAccording to the manual, it is supposed to contain \n\nthe object's attributes' names, the\n names of its class's attributes, and\n recursively of the attributes of its\n class's base classes.\n\nIt says also that\n\nThe list is not necessarily complete.\n\nIs there a way to get the complete list? I always assumed that dir returns a complete list but apparently it does not...\nAlso: is there a way to list only attributes? Or only methods?\nEdit: this is actually a bug in Python -> supposedly it is fixed in the 3.0 branch (and perhaps also in 2.6)\n", "output": "For the complete list of attributes, the short answer is: no. The problem is that the attributes are actually defined as the arguments accepted by the getattr built-in function. As the user can reimplement __getattr__, suddenly allowing any kind of attribute, there is no possible generic way to generate that list. The dir function returns the keys in the __dict__ attribute, i.e. all the attributes accessible if the __getattr__ method is not reimplemented.\nFor the second question, it does not really make sense. Actually, methods are callable attributes, nothing more. 
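That "callable attributes" observation suggests a simple way to split dir() output into data attributes and methods using callable() (a minimal sketch; Example is just an illustrative class):

```python
class Example:
    attr = 42
    def method(self):
        return self.attr

obj = Example()
# Methods are just callable attributes, so partition dir() output
# by whether each attribute is callable.
methods = [name for name in dir(obj) if callable(getattr(obj, name))]
plain = [name for name in dir(obj) if not callable(getattr(obj, name))]
print('method' in methods, 'attr' in plain)  # True True
```

Note this only sees what dir() reports; attributes synthesized by a custom __getattr__ are still invisible to it.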
You could, though, filter the callable attributes and, using the inspect module, determine which are class methods, regular methods, or functions.\n"} {"input": "What is __future__ in Python used for and how/when to use it, and how it works People including me know there is something in Python called __future__ and it appears in quite a few modules I read. And the dull people like me don't know why it's there, and how/when to use it, even after reading Python's __future__ doc. \nSo, can anyone explain it with examples?\nI have got a few answers quickly, which all look correct in terms of the basic usage.\nHowever, to understand further how __future__ works:\nI just realized one key thing that was confusing me when I tried to understand it, that is, how can a current Python release include something that will only be released in a future release?\nAnd how can a program using a new feature from a future Python release be compiled successfully by the current release?\nSo I guess the current release has already packaged some potential features that will be included in future releases - is this right? And the features are available only via __future__ because they have not become standard yet - am I right?\n", "output": "With it, you can gradually get accustomed to incompatible changes, or to changes introducing new keywords.\nE.g., for using context managers, you had to do from __future__ import with_statement in 2.5, as the with keyword was new and shouldn't be used as a variable name any longer. In order to be able to use a program which uses variables named with, the above import statement is needed.\nAnother example is\nfrom __future__ import division\nprint 8/7 # prints 1.1428571428571428\nprint 8//7 # prints 1\n\nWithout the __future__ stuff, both print statements would print 1.\nThe internal difference is that without that import, / is mapped to the __div__() method, while with it, __truediv__() is used. 
(In any case, // calls __floordiv__().)\nApropos print: print becomes a function in 3.x, losing its special property as a keyword. So it is the other way round.\n>>> print\n\n>>> from __future__ import print_function\n>>> print\n<built-in function print>\n>>>\n\n"} {"input": "Which version of Python do I have installed? I have to run a Python script on a Windows server. How can I know which version of Python I have, and does it even really matter? I was thinking of updating to the latest version of Python.\n", "output": "python -V\n\nhttp://docs.python.org/using/cmdline.html#generic-options\n--version may also work (introduced in version 2.5)\n"} {"input": "How do I remove a substring from the end of a string in Python? I have the following code:\nurl = 'abcdc.com'\nprint(url.strip('.com'))\n\nI expected: abcdc\nI got: abcd\nNow I do \nurl.rsplit('.com', 1)\n\nIs there a better way?\n", "output": "You could do this:\nurl = 'abcdc.com'\nif url.endswith('.com'):\n url = url[:-4]\n\nOr using regular expressions:\nimport re\nurl = 'abcdc.com'\nurl = re.sub('\\.com$', '', url)\n\n"} {"input": "Printing Python version in output How can I print the version number for the current Python installation in the output? \n", "output": "Try\nimport sys\nprint(sys.version)\n\nThis prints the full version information string. If you only want the python version number, then Bastien L\u00e9onard's solution is the best. You might want to examine the full string and see if you need it or portions of it.\n"} {"input": "How to get console window width in python Is there a way in python to programmatically determine the width of the console? I mean the number of characters that fits in one line without wrapping, not the pixel width of the window.\nEdit\nLooking for a solution that works on Linux\n", "output": "import os\nrows, columns = os.popen('stty size', 'r').read().split()\n\nuses the 'stty size' command which, according to a thread on the Python mailing list, is reasonably universal on Linux. 
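Since Python 3.3 there is also a standard-library shortcut that avoids spawning a process altogether: shutil.get_terminal_size consults the COLUMNS/LINES environment variables, then the terminal itself, and falls back to 80x24 when neither is available (e.g. in a pipe or cron job). A sketch:

```python
import shutil

# Returns an os.terminal_size named tuple (columns, lines); the fallback
# is used when stdout is not attached to a real terminal.
columns, lines = shutil.get_terminal_size(fallback=(80, 24))
print(columns, lines)
```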
The snippet opens the 'stty size' command as a file, 'reads' from it, and uses a simple string split to separate the coordinates.\nUnlike the os.environ[\"COLUMNS\"] value (which I can't access in spite of using bash as my standard shell), the data will also be up to date, whereas I believe the os.environ[\"COLUMNS\"] value would only be valid at the time the Python interpreter was launched (suppose the user has resized the window since then).\n"} {"input": "Python: defaultdict of defaultdict? Is there a way to have a defaultdict(defaultdict(int)) in order to make the following code work?\nfor x in stuff:\n d[x.a][x.b] += x.c_int\n\nd needs to be built ad-hoc, depending on x.a and x.b elements.\nI could use:\nfor x in stuff:\n d[x.a,x.b] += x.c_int\n\nbut then I wouldn't be able to use:\nd.keys()\nd[x.a].keys()\n\n", "output": "Yes, like this:\ndefaultdict(lambda : defaultdict(int))\n\n"} {"input": "How can the euclidean distance be calculated with numpy? I have two points in 3D:\n(xa, ya, za)\n(xb, yb, zb)\n\nAnd I want to calculate the distance:\ndist = sqrt((xa-xb)^2 + (ya-yb)^2 + (za-zb)^2)\n\nWhat's the best way to do this with Numpy, or with Python in general? I have:\na = numpy.array((xa, ya, za))\nb = numpy.array((xb, yb, zb))\n\n", "output": "Use numpy.linalg.norm:\ndist = numpy.linalg.norm(a-b)\n\n"} {"input": "Can \"list_display\" in a Django ModelAdmin display attributes of ForeignKey fields? I have a Person model that has a foreign key relationship to Book. Book has a number of fields, but I'm most concerned about \"author\" (a standard CharField).\nWith that being said, in my PersonAdmin model, I'd like to display \"book.author\" using \"list_display\". I've tried all of the obvious methods for doing so (see below), but nothing seems to work. 
Any suggestions?\nclass PersonAdmin(admin.ModelAdmin):\n list_display = ['book.author',]\n\n", "output": "As another option, you can do look ups like:\nclass UserAdmin(admin.ModelAdmin):\n list_display = (..., 'get_author')\n\n def get_author(self, obj):\n return obj.book.author\n get_author.short_description = 'Author'\n get_author.admin_order_field = 'book__author'\n\n"} {"input": "How to get current CPU and RAM usage in Python? What's your preferred way of getting current system status (current CPU, RAM, free disk space, etc.) in Python? Bonus points for *nix and Windows platforms.\nThere seems to be a few possible ways of extracting that from my search:\n\nUsing a library such as PSI (that currently seems not actively developed and not supported on multiple platform) or something like pystatgrab (again no activity since 2007 it seems and no support for Windows).\nUsing platform specific code such as using a os.popen(\"ps\") or similar for the *nix systems and MEMORYSTATUS in ctypes.windll.kernel32 (see this recipe on ActiveState) for the Windows platform. 
One could put a Python class together with all those code snippets.\n\nIt's not that those methods are bad but is there already a well-supported, multi-platform way of doing the same thing?\n", "output": "The psutil library will give you some system information (CPU / Memory usage) on a variety of platforms:\n\npsutil is a module providing an interface for retrieving information on running processes and system utilization (CPU, memory) in a portable way by using Python, implementing many functionalities offered by tools like ps, top and Windows task manager.\nIt currently supports Linux, Windows, OSX, Sun Solaris, FreeBSD, OpenBSD and NetBSD, both 32-bit and 64-bit architectures, with Python versions from 2.6 to 3.5 (users of Python 2.4 and 2.5 may use 2.1.3 version).\n\n"} {"input": "DatabaseError: current transaction is aborted, commands ignored until end of transaction block I got a lot of errors with the message :\n\"DatabaseError: current transaction is aborted, commands ignored until end of transaction block\"\n\nafter changed from python-psycopg to python-psycopg2 as Django project's database engine.\nThe code remains the same, just dont know where those errors are from.\n", "output": "To get rid of the error, roll back the last (erroneous) transaction after you've fixed your code:\nfrom django.db import transaction\ntransaction.rollback()\n\nYou can use try-except to prevent the error from occurring:\nfrom django.db import transaction, DatabaseError\ntry:\n a.save()\nexcept DatabaseError:\n transaction.rollback()\n\nRefer : Django documentation\n"} {"input": "Why use Python's os module methods instead of executing shell commands directly? I am trying to understand what is the motivation behind using Python's library functions for executing OS-specific tasks such as creating files/directories, changing file attributes, etc. 
instead of just executing those commands via os.system() or subprocess.call()?\nFor example, why would I want to use os.chmod instead of doing os.system(\"chmod...\")? \nI understand that it is more \"pythonic\" to use Python's available library methods as much as possible instead of just executing shell commands directly. But, is there any other motivation behind doing this from a functionality point of view? \nI am only talking about executing simple one-line shell commands here. When we need more control over the execution of the task, I understand that using the subprocess module makes more sense, for example.\n", "output": "\nIt's faster: os.system and subprocess.call create new processes, which is unnecessary for something this simple. In fact, os.system and subprocess.call with the shell argument usually create at least two new processes: the first one being the shell, and the second one being the command that you're running (if it's not a shell built-in like test).\nSome commands are useless in a separate process. For example, if you run os.system(\"cd dir/\"), it will change the current working directory of the child process, but not of the Python process. You need to use os.chdir for that.\nYou don't have to worry about special characters interpreted by the shell. os.chmod(path, mode) will work no matter what the filename is, whereas os.system(\"chmod 777 \" + path) will fail horribly if the filename is something like ; rm -rf ~. (Note that you can work around this if you use subprocess.call without the shell argument.)\nYou don't have to worry about filenames that begin with a dash. os.chmod(\"--quiet\", mode) will change the permissions of the file named --quiet, but os.system(\"chmod 777 --quiet\") will fail, as --quiet is interpreted as an argument. This is true even for subprocess.call([\"chmod\", \"777\", \"--quiet\"]).\nYou have fewer cross-platform and cross-shell concerns, as Python's standard library is supposed to deal with that for you. 
Does your system have a chmod command? Is it installed? Does it support the parameters that you expect it to support? The os module will try to be as cross-platform as possible and documents when that's not possible.\nIf the command you're running has output that you care about, you need to parse it, which is trickier than it sounds, as you may forget about corner-cases (filenames with spaces, tabs and newlines in them), even when you don't care about portability.\n\n"} {"input": "How to write inline if statement for print? I need to print some stuff only when a boolean variable is set to True. So, after looking at this, I tried with a simple example:\n>>> a = 100\n>>> b = True\n>>> print a if b\n File \"<stdin>\", line 1\n print a if b\n ^\nSyntaxError: invalid syntax \n\nSame thing if I write print a if b==True.\nWhat am I missing here?\n", "output": "Python does not have a trailing if statement.\nThere are two kinds of if in Python:\n\nif statement:\nif condition: statement\nif condition:\n block\n\nif expression (introduced in Python 2.5)\nexpression_if_true if condition else expression_if_false\n\n\nAnd note that both print a and b = a are statements. Only the a part is an expression. So if you write\nprint a if b else 0\n\nit means\nprint (a if b else 0)\n\nand similarly when you write\nx = a if b else 0\n\nit means\nx = (a if b else 0)\n\nNow what would it print/assign if there was no else clause? The print/assignment is still there.\nAnd note that if you don't want it to be there, you can always write the regular if statement on a single line, though it's less readable and there is really no reason to avoid the two-line variant.\n"} {"input": "How to uninstall Python 2.7 on a Mac OS X 10.6.4? I want to completely remove Python 2.7 from my Mac OS X 10.6.4. I managed to remove the entry from the PATH variable by reverting my .bash_profile. But I also want to remove all directories, files, symlinks, and entries that got installed by the Python 2.7 install package. 
I've got the install package from http://www.python.org/. What directories/files/configuration file entries do I need to remove? Is there a list somewhere?\n", "output": "The complete list is documented here. Basically, all you need to do is the following:\n\nRemove the Python 2.7 framework\nsudo rm -rf /Library/Frameworks/Python.framework/Versions/2.7\nRemove the Python 2.7 applications directory\nsudo rm -rf \"/Applications/Python 2.7\"\nRemove the symbolic links in /usr/local/bin that point to this Python version see ls -l /usr/local/bin | grep '../Library/Frameworks/Python.framework/Versions/2.7' and then run the following command to remove all the links:\ncd /usr/local/bin/\nls -l /usr/local/bin | grep '../Library/Frameworks/Python.framework/Versions/2.7' | awk '{print $9}' | tr -d @ | xargs rm\n\nIf necessary, edit your shell profile file(s) to remove adding /Library/Frameworks/Python.framework/Versions/2.7 to your PATH environment file. Depending on which shell you use, any of the following files may have been modified:\n~/.bash_login, ~/.bash_profile, ~/.cshrc, ~/.profile, ~/.tcshrc, and/or ~/.zprofile.\n\n"} {"input": "Split by comma and strip whitespace in Python I have some python code that splits on comma, but doesn't strip the whitespace:\n>>> string = \"blah, lots , of , spaces, here \"\n>>> mylist = string.split(',')\n>>> print mylist\n['blah', ' lots ', ' of ', ' spaces', ' here ']\n\nI would rather end up with whitespace removed like this:\n['blah', 'lots', 'of', 'spaces', 'here']\n\nI am aware that I could loop through the list and strip() each item but, as this is Python, I'm guessing there's a quicker, easier and more elegant way of doing it.\n", "output": "Use list comprehension -- simpler, and just as easy to read as a for loop.\nmy_string = \"blah, lots , of , spaces, here \"\n[x.strip() for x in my_string.split(',')]\n\nSee: Python docs on List Comprehension\nA good 2 second explanation of list comprehension.\n"} {"input": "What exactly are 
Python's iterator, iterable, and iteration protocols? What are the most basic definitions of \"iterable\", \"iterator\" and \"iteration\" in Python?\nI've read multiple definitions but their exact meaning still won't sink in.\nCan someone please help me with the basic idea?\n", "output": "Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.\nIn Python, iterable and iterator have specific meanings.\nAn iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.\nAn iterator is an object with a next (Python 2) or __next__ (Python 3) method. \nWhenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.\nA good place to start learning would be the iterators section of the tutorial and the iterator types section of the standard types page. After you understand the basics, try the iterators section of the Functional Programming HOWTO.\n"} {"input": "What is the best way to implement nested dictionaries in Python? I have a data structure which essentially amounts to a nested dictionary. Let's say it looks like this:\n{'new jersey': {'mercer county': {'plumbers': 3,\n 'programmers': 81},\n 'middlesex county': {'programmers': 81,\n 'salesmen': 62}},\n 'new york': {'queens county': {'plumbers': 9,\n 'salesmen': 36}}}\n\nNow, maintaining and creating this is pretty painful; every time I have a new state/county/profession I have to create the lower layer dictionaries via obnoxious try/catch blocks. Moreover, I have to create annoying nested iterators if I want to go over all the values. 
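The layer-by-layer boilerplate in question looks roughly like this for a single insertion (a sketch with made-up sample data):

```python
data = {}
state, county, job, count = 'new jersey', 'mercer county', 'plumbers', 3

# Every missing layer has to be created by hand before the leaf can be set.
if state not in data:
    data[state] = {}
if county not in data[state]:
    data[state][county] = {}
data[state][county][job] = data[state][county].get(job, 0) + count

print(data)  # {'new jersey': {'mercer county': {'plumbers': 3}}}
```

Each new state/county/profession repeats the same two guards, which is exactly the pain point described above.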
\nI could also use tuples as keys, like such:\n{('new jersey', 'mercer county', 'plumbers'): 3,\n ('new jersey', 'mercer county', 'programmers'): 81,\n ('new jersey', 'middlesex county', 'programmers'): 81,\n ('new jersey', 'middlesex county', 'salesmen'): 62,\n ('new york', 'queens county', 'plumbers'): 9,\n ('new york', 'queens county', 'salesmen'): 36}\n\nThis makes iterating over the values very simple and natural, but it is more syntactically painful to do things like aggregations and looking at subsets of the dictionary (e.g. if I just want to go state-by-state).\nBasically, sometimes I want to think of a nested dictionary as a flat dictionary, and sometimes I want to think of it indeed as a complex hierarchy. I could wrap this all in a class, but it seems like someone might have done this already. Alternatively, it seems like there might be some really elegant syntactical constructions to do this. \nHow could I do this better?\nAddendum: I'm aware of setdefault() but it doesn't really make for clean syntax. Also, each sub-dictionary you create still needs to have setdefault() manually set.\n", "output": "class AutoVivification(dict):\n \"\"\"Implementation of perl's autovivification feature.\"\"\"\n def __getitem__(self, item):\n try:\n return dict.__getitem__(self, item)\n except KeyError:\n value = self[item] = type(self)()\n return value\n\nTesting:\na = AutoVivification()\n\na[1][2][3] = 4\na[1][3][3] = 5\na[1][2]['test'] = 6\n\nprint a\n\nOutput:\n{1: {2: {'test': 6, 3: 4}, 3: {3: 5}}}\n\n"} {"input": "pip: dealing with multiple Python versions? Is there any way to make pip play well with multiple versions of Python? 
For example, I want to use pip to explicitly install things to either my site 2.5 installation or my site 2.6 installation.\nFor example, with easy_install, I use easy_install-2.{5,6}.\nAnd, yes \u00e2\u0080\u0094 I know about virtualenv, and no \u00e2\u0080\u0094 it's not a solution to this particular problem.\n", "output": "Since version 0.8, Pip supports pip-{version}. You can use it the same as easy_install-{version}:\n$ pip-2.5 install myfoopackage\n$ pip-2.6 install otherpackage\n$ pip-2.7 install mybarpackage\n\n\nEDIT: pip changed its schema to use pipVERSION instead of pip-VERSION in version 1.5. You should use the following if you have pip >= 1.5:\n$ pip2.6 install otherpackage\n$ pip2.7 install mybarpackage\n\nCheck https://github.com/pypa/pip/pull/1053 for more details\n\nReferences:\n\nhttps://github.com/pypa/pip/issues/200\nhttp://www.pip-installer.org/docs/pip/en/0.8.3/news.html#id4\n\n"} {"input": "How to install pip with Python 3? I want to install pip. It should support Python 3, but it requires setuptools, which is available only for Python 2.\nHow can I install pip with Python 3?\n", "output": "I was able to install pip for python 3 on Ubuntu just by running sudo apt-get install python3-pip. \n"} {"input": "Why does the floating-point value of 4*0.1 look nice in Python 3 but 3*0.1 doesn't? 
I know that most decimals don't have an exact floating point representation (Is floating point math broken?).\nBut I don't see why 4*0.1 is printed nicely as 0.4, but 3*0.1 isn't, when\nboth values actually have ugly decimal representations:\n>>> 3*0.1\n0.30000000000000004\n>>> 4*0.1\n0.4\n>>> from decimal import Decimal\n>>> Decimal(3*0.1)\nDecimal('0.3000000000000000444089209850062616169452667236328125')\n>>> Decimal(4*0.1)\nDecimal('0.40000000000000002220446049250313080847263336181640625')\n\n", "output": "The simple answer is because 3*0.1 != 0.3 due to quantization (roundoff) error (whereas 4*0.1 == 0.4 because multiplying by a power of two is usually an \"exact\" operation).\nYou can use the .hex method in Python to view the internal representation of a number (basically, the exact binary floating point value, rather than the base-10 approximation). This can help to explain what's going on under the hood.\n>>> (0.1).hex()\n'0x1.999999999999ap-4'\n>>> (0.3).hex()\n'0x1.3333333333333p-2'\n>>> (0.1*3).hex()\n'0x1.3333333333334p-2'\n>>> (0.4).hex()\n'0x1.999999999999ap-2'\n>>> (0.1*4).hex()\n'0x1.999999999999ap-2'\n\n0.1 is 0x1.999999999999a times 2^-4. The \"a\" at the end means the digit 10 - in other words, 0.1 in binary floating point is very slightly larger than the \"exact\" value of 0.1 (because the final 0x0.99 is rounded up to 0x0.a). When you multiply this by 4, a power of two, the exponent shifts up (from 2^-4 to 2^-2) but the number is otherwise unchanged, so 4*0.1 == 0.4.\nHowever, when you multiply by 3, the little tiny difference between 0x0.99 and 0x0.a0 (0x0.07) magnifies into a 0x0.15 error, which shows up as a one-digit error in the last position. This causes 0.1*3 to be very slightly larger than the rounded value of 0.3.\nPython 3's float repr is designed to be round-trippable, that is, the value shown should be exactly convertible into the original value. 
Therefore, it cannot display 0.3 and 0.1*3 exactly the same way, or the two different numbers would end up the same after round-tripping. Consequently, Python 3's repr engine chooses to display one with a slight apparent error.\n"} {"input": "Sending HTML email using Python How can I send the HTML content in an email using Python? I can send simple text.\n", "output": "From Python v2.6.2 documentation - 19.1.11. email: Examples:\n\nHere's an example of how to create an HTML message with an alternative plain text version:\n\n#! /usr/bin/python\n\nimport smtplib\n\nfrom email.mime.multipart import MIMEMultipart\nfrom email.mime.text import MIMEText\n\n# me == my email address\n# you == recipient's email address\nme = \"my@email.com\"\nyou = \"your@email.com\"\n\n# Create message container - the correct MIME type is multipart/alternative.\nmsg = MIMEMultipart('alternative')\nmsg['Subject'] = \"Link\"\nmsg['From'] = me\nmsg['To'] = you\n\n# Create the body of the message (a plain-text and an HTML version).\ntext = \"Hi!\\nHow are you?\\nHere is the link you wanted:\\nhttp://www.python.org\"\nhtml = \"\"\"\\\n<html>\n <head></head>\n <body>\n <p>Hi!<br>\n How are you?<br>\n Here is the <a href=\"http://www.python.org\">link</a> you wanted.\n </p>\n </body>\n</html>\n\"\"\"\n\n# Record the MIME types of both parts - text/plain and text/html.\npart1 = MIMEText(text, 'plain')\npart2 = MIMEText(html, 'html')\n\n# Attach parts into message container.\n# According to RFC 2046, the last part of a multipart message, in this case\n# the HTML message, is best and preferred.\nmsg.attach(part1)\nmsg.attach(part2)\n\n# Send the message via local SMTP server.\ns = smtplib.SMTP('localhost')\n# sendmail function takes 3 arguments: sender's address, recipient's address\n# and message to send - here it is sent as one string.\ns.sendmail(me, you, msg.as_string())\ns.quit()\n\n"} {"input": "Python __str__ versus __unicode__ Is there a python convention for when you should implement __str__() versus __unicode__(). I've seen classes override __unicode__() more frequently than __str__() but it doesn't appear to be consistent. Are there specific rules when it is better to implement one versus the other? Is it necessary/good practice to implement both?\n", "output": "__str__() is the old method -- it returns bytes. __unicode__() is the new, preferred method -- it returns characters. The names are a bit confusing, but in 2.x we're stuck with them for compatibility reasons. Generally, you should put all your string formatting in __unicode__(), and create a stub __str__() method:\ndef __str__(self):\n return unicode(self).encode('utf-8')\n\nIn 3.0, str contains characters, so the same methods are named __bytes__() and __str__(). These behave as expected.\n"} {"input": "Reimport a module in python while interactive I know it can be done, but I never remember how.\nHow can you reimport a module in python? \nThe scenario is as follows: I import a module interactively and tinker with it, but then I face an error. I fix the error in the .py file and then I want to reimport the fixed module without quitting python. How can I do it ?\n", "output": "This should work:\nreload(my.module)\n\nFrom the Python docs\n\nReload a previously imported module. 
The argument must be a module object, so it must have been successfully imported before. This is useful if you have edited the module source file using an external editor and want to try out the new version without leaving the Python interpreter.\n\n"} {"input": "Django: Why do some model fields clash with each other? I want to create an object that contains 2 links to Users. For example:\nclass GameClaim(models.Model):\n target = models.ForeignKey(User)\n claimer = models.ForeignKey(User)\n isAccepted = models.BooleanField()\n\nbut I am getting the following errors when running the server:\n\n\nAccessor for field 'target' clashes with related field 'User.gameclaim_set'. Add a related_name argument to the definition for 'target'.\n\n\nAccessor for field 'claimer' clashes with related field 'User.gameclaim_set'. Add a related_name argument to the definition for 'claimer'.\n\n\nCan you please explain why I am getting the errors and how to fix them?\n", "output": "You have two foreign keys to User. Django automatically creates a reverse relation from User back to GameClaim, which is usually gameclaim_set. However, because you have two FKs, you would have two gameclaim_set attributes, which is obviously impossible. So you need to tell Django what name to use for the reverse relation.\nUse the related_name attribute in the FK definition. e.g.\nclass GameClaim(models.Model):\n target = models.ForeignKey(User, related_name='gameclaim_targets')\n claimer = models.ForeignKey(User, related_name='gameclaim_users')\n isAccepted = models.BooleanField()\n\n"} {"input": "numpy: access an array by column Suppose I have:\ntest = numpy.array([[1, 2], [3, 4], [5, 6]])\n\ntest[i] gets me ith line of the array (eg [1, 2]). How can I access the ith column? (eg [1, 3, 5]). Also, would this be an expensive operation?\n", "output": ">>> test[:,0]\narray([1, 3, 5])\n\nSimilarly, \n>>> test[1,:]\narray([3, 4])\n\nlets you access rows. 
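On the cost question: basic slicing like this is cheap, because NumPy hands back a view into the same memory rather than a copy. A small sketch (assuming NumPy is installed; the array just mirrors the one in the question):

```python
import numpy as np

test = np.array([[1, 2], [3, 4], [5, 6]])

col = test[:, 0]   # first column
row = test[1, :]   # second row

# Basic slicing returns a view, not a copy: `col` shares its
# memory with `test`, so no data is moved.
print(col)                # [1 3 5]
print(row)                # [3 4]
print(col.base is test)   # True

# Writing through the view is visible in the original array.
col[0] = 99
print(test[0, 0])         # 99
```

If you actually need an independent array, ask for one explicitly with test[:, 0].copy().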
This is covered in Section 1.4 (Indexing) of the NumPy reference. This is quick, at least in my experience. It's certainly much quicker than accessing each element in a loop.\n"} {"input": "how to get row count of pandas dataframe? I'm trying to get the number of rows of dataframe df with Pandas, and here is my code\nMethod 1:\ntotal_rows = df.count\nprint total_rows +1\n\nMethod 2:\ntotal_rows = df['First_columnn_label'].count\nprint total_rows +1\n\nboth the code snippets give me this error: \n\nTypeError: unsupported operand type(s) for +: 'instancemethod' and 'int'\n\nI'd be grateful for any suggestions what I'm doing wrong.\nEDIT: According to the answer given by root the best (the fastest) way to check df length is to call:\nlen(df.index)\n\n", "output": "You can use the .shape property or just len(DataFrame.index) as there are notable performance differences:\nIn [1]: import numpy as np\n\nIn [2]: import pandas as pd\n\nIn [3]: df = pd.DataFrame(np.arange(9).reshape(3,3))\n\nIn [4]: df\nOut[4]: \n 0 1 2\n0 0 1 2\n1 3 4 5\n2 6 7 8\n\nIn [5]: df.shape\nOut[5]: (3, 3)\n\nIn [6]: timeit df.shape\n1000000 loops, best of 3: 1.17 us per loop\n\nIn [7]: timeit df[0].count()\n10000 loops, best of 3: 56 us per loop\n\nIn [8]: len(df.index)\nOut[8]: 3\n\nIn [9]: timeit len(df.index)\n1000000 loops, best of 3: 381 ns per loop\n\n\nEDIT: As @Dan Allen noted in the comments len(df.index) and df[0].count() are not interchangeable as count excludes NaNs,\n"} {"input": "How to check version of python modules? I just installed the python modules: construct and statlib with setuptools like this:\n# Install setuptools to be able to download the following\nsudo apt-get install python-setuptools\n\n# Install statlib for lightweight statistical tools\nsudo easy_install statlib\n\n# Install construct for packing/unpacking binary data\nsudo easy_install construct\n\nI want to be able to (programmatically) check their versions. 
Is there an equivalent to python --version I can run from the command line?\nMy python version is 2.7.3.\n", "output": "I suggest using pip in place of easy_install. With pip, you can list all installed packages and their versions with \npip freeze\n\nFor an individual module, you can try the __version__ attribute, however there are modules without it:\n$ pip freeze | grep lxml\nlxml==2.3\n$ python -c \"import lxml; print lxml.__version__\"\nTraceback (most recent call last):\n File \"<string>\", line 1, in <module>\nAttributeError: 'module' object has no attribute '__version__'\n\nLast, as you run your command with a sudo prefix, I guess you're installing to the global python environment. I strongly advise taking a look at python virtual environment managers, for example virtualenvwrapper\n"} {"input": "Using Django time/date widgets in custom form How can I use the nifty JavaScript date and time widgets that the default admin uses with my custom view?\nI have looked through the Django forms documentation, and it briefly mentions django.contrib.admin.widgets, but I don't know how to use it?\nHere is my template that I want it applied on.\n
<form action=\"\" method=\"post\">\n <table>\n {% for f in form %}\n <tr>\n <th>{{ f.name }}</th>\n <td>{{ f }}</td>\n </tr>\n {% endfor %}\n </table>\n <input type=\"submit\" value=\"Submit\" />\n</form>
\n\nAlso, I think it should be noted that I haven't really written a view up myself for this form, I am using a generic view. Here is the entry from the url.py:\n(r'^admin/products/add/$', create_object, {'model': Product, 'post_save_redirect': ''}),\n\nAnd I am relatively new to the whole Django/MVC/MTV thing, so please go easy...\n", "output": "The growing complexity of this answer over time, and the many hacks required, probably ought to caution you against doing this at all. It's relying on undocumented internal implementation details of the admin, is likely to break again in future versions of Django, and is no easier to implement than just finding another JS calendar widget and using that.\nThat said, here's what you have to do if you're determined to make this work:\n\nDefine your own ModelForm subclass for your model (best to put it in forms.py in your app), and tell it to use the AdminDateWidget / AdminTimeWidget / AdminSplitDateTime (replace 'mydate' etc with the proper field names from your model):\nfrom django import forms\nfrom my_app.models import Product\nfrom django.contrib.admin import widgets \n\nclass ProductForm(forms.ModelForm):\n class Meta:\n model = Product\n def __init__(self, *args, **kwargs):\n super(ProductForm, self).__init__(*args, **kwargs)\n self.fields['mydate'].widget = widgets.AdminDateWidget()\n self.fields['mytime'].widget = widgets.AdminTimeWidget()\n self.fields['mydatetime'].widget = widgets.AdminSplitDateTime()\n\nChange your URLconf to pass 'form_class': ProductForm instead of 'model': Product to the generic create_object view (that'll mean \"from my_app.forms import ProductForm\" instead of \"from my_app.models import Product\", of course).\nIn the head of your template, include {{ form.media }} to output the links to the Javascript files.\nAnd the hacky part: the admin date/time widgets presume that the i18n JS stuff has been loaded, and also require core.js, but don't provide either one automatically. 
So in your template above {{ form.media }} you'll need:\n<script type=\"text/javascript\" src=\"/my_admin/jsi18n/\"></script>\n<script type=\"text/javascript\" src=\"/media/admin/js/core.js\"></script>\n\nYou may also wish to use the following admin CSS (thanks Alex for mentioning this):\n<link rel=\"stylesheet\" type=\"text/css\" href=\"/media/admin/css/forms.css\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"/media/admin/css/base.css\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"/media/admin/css/global.css\"/>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"/media/admin/css/widgets.css\"/>\n\nThis implies that Django's admin media (ADMIN_MEDIA_PREFIX) is at /media/admin/ - you can change that for your setup. Ideally you'd use a context processor to pass these values to your template instead of hardcoding them, but that's beyond the scope of this question.\nThis also requires that the URL /my_admin/jsi18n/ be manually wired up to the django.views.i18n.javascript_catalog view (or null_javascript_catalog if you aren't using I18N). You have to do this yourself instead of going through the admin application so it's accessible regardless of whether you're logged into the admin (thanks Jeremy for pointing this out). Sample code for your URLconf:\n(r'^my_admin/jsi18n', 'django.views.i18n.javascript_catalog'),\n\nLastly, if you are using Django 1.2 or later, you need some additional code in your template to help the widgets find their media:\n{% load adminmedia %} /* At the top of the template. */\n\n/* In the head section of the template. */\n<script type=\"text/javascript\">\nwindow.__admin_media_prefix__ = \"{% filter escapejs %}{% admin_media_prefix %}{% endfilter %}\";\n</script>\n\nThanks lupefiasco for this addition.\n"} {"input": "How do I convert local time to UTC in Python? How do I convert a datetime string in local time to a string in UTC time?\nI'm sure I've done this before, but can't find it and SO will hopefully help me (and others) do that in future.\nClarification: For example, if I have 2008-09-17 14:02:00 in my local timezone (+10), I'd like to generate a string with the equivalent UTC time: 2008-09-17 04:02:00.\nAlso, from http://lucumr.pocoo.org/2011/7/15/eppur-si-muove/, note that in general this isn't possible as with DST and other issues there is no unique conversion from local time to UTC time.\n", "output": "First, parse the string into a naive datetime object. This is an instance of datetime.datetime with no attached timezone information. 
See documentation for datetime.strptime for information on parsing the date string.\nUse the pytz module, which comes with a full list of time zones + UTC. Figure out what the local timezone is, construct a timezone object from it, and manipulate and attach it to the naive datetime.\nFinally, use datetime.astimezone() method to convert the datetime to UTC.\nSource code, using local timezone \"America/Los_Angeles\", for the string \"2001-2-3 10:11:12\":\nimport pytz, datetime\nlocal = pytz.timezone (\"America/Los_Angeles\")\nnaive = datetime.datetime.strptime (\"2001-2-3 10:11:12\", \"%Y-%m-%d %H:%M:%S\")\nlocal_dt = local.localize(naive, is_dst=None)\nutc_dt = local_dt.astimezone (pytz.utc)\n\nFrom there, you can use the strftime() method to format the UTC datetime as needed:\nutc_dt.strftime (\"%Y-%m-%d %H:%M:%S\")\n\n"} {"input": "Choosing Java vs Python on Google App Engine Currently Google App Engine supports both Python & Java. Java support is less mature. However, Java seems to have a longer list of libraries and especially support for Java bytecode regardless of the languages used to write that code. Which language will give better performance and more power? Please advise. Thank you!\nEdit:\nhttp://groups.google.com/group/google-appengine-java/web/will-it-play-in-app-engine?pli=1\nEdit:\nBy \"power\" I mean better expandability and inclusion of available libraries outside the framework. Python allows only pure Python libraries, though.\n", "output": "I'm biased (being a Python expert but pretty rusty in Java) but I think the Python runtime of GAE is currently more advanced and better developed than the Java runtime -- the former has had one extra year to develop and mature, after all.\nHow things will proceed going forward is of course hard to predict -- demand is probably stronger on the Java side (especially since it's not just about Java, but other languages perched on top of the JVM too, so it's THE way to run e.g. 
PHP or Ruby code on App Engine); the Python App Engine team however does have the advantage of having on board Guido van Rossum, the inventor of Python and an amazingly strong engineer.\nIn terms of flexibility, the Java engine, as already mentioned, does offer the possibility of running JVM bytecode made by different languages, not just Java -- if you're in a multi-language shop that's a pretty large positive. Vice versa, if you loathe Javascript but must execute some code in the user's browser, Java's GWT (generating the Javascript for you from your Java-level coding) is far richer and more advanced than Python-side alternatives (in practice, if you choose Python, you'll be writing some JS yourself for this purpose, while if you choose Java GWT is a usable alternative if you loathe writing JS).\nIn terms of libraries it's pretty much a wash -- the JVM is restricted enough (no threads, no custom class loaders, no JNI, no relational DB) to hamper the simple reuse of existing Java libraries as much, or more, than existing Python libraries are similarly hampered by the similar restrictions on the Python runtime.\nIn terms of performance, I think it's a wash, though you should benchmark on tasks of your own -- don't rely on the performance of highly optimized JIT-based JVM implementations discounting their large startup times and memory footprints, because the app engine environment is very different (startup costs will be paid often, as instances of your app are started, stopped, moved to different hosts, etc, all transparently to you -- such events are typically much cheaper with Python runtime environments than with JVMs).\nThe XPath/XSLT situation (to be euphemistic...) is not exactly perfect on either side, sigh, though I think it may be a tad less bad in the JVM (where, apparently, substantial subsets of Saxon can be made to run, with some care). 
I think it's worth opening issues on the Appengine Issues page with XPath and XSLT in their titles -- right now there are only issues asking for specific libraries, and that's myopic: I don't really care HOW a good XPath/XSLT is implemented, for Python and/or for Java, as long as I get to use it. (Specific libraries may ease migration of existing code, but that's less important than being able to perform such tasks as \"rapidly apply XSLT transformation\" in SOME way!-). I know I'd star such an issue if well phrased (especially in a language-independent way).\nLast but not least: remember that you can have different version of your app (using the same datastore) some of which are implemented with the Python runtime, some with the Java runtime, and you can access versions that differ from the \"default/active\" one with explicit URLs. So you could have both Python and Java code (in different versions of your app) use and modify the same data store, granting you even more flexibility (though only one will have the \"nice\" URL such as foobar.appspot.com -- which is probably important only for access by interactive users on browsers, I imagine;-).\n"} {"input": "Using sphinx with Markdown instead of RST I hate RST but love sphinx. Is there a way that sphinx reads markdown instead of reStructuredText?\n", "output": "The \"proper\" way to do that would be to write a docutils parser for markdown. (Plus a Sphinx option to choose the parser.) The beauty of this would be instant support for all docutils output formats (but you might not care about that, as similar markdown tools already exist for most). Ways to approach that without developing a parser from scratch:\n\nYou could cheat and write a \"parser\" that uses Pandoc to convert markdown to RST and pass that to the RST parser :-).\nYou can use an existing markdown->XML parser and transform the result (using XSLT?) 
to the docutils schema.\nYou could take some existing python markdown parser that lets you define a custom renderer and make it build docutils node tree.\nYou could fork the existing RST reader, ripping out everything irrelevant to markdown and changing the different syntaxes (this comparison might help)...\nEDIT: I don't recommend this route unless you're prepared to heavily test it. Markdown already has too many subtly different dialects and this will likely result in yet-another-one...\n\nUPDATE: https://github.com/sgenoud/remarkdown is a markdown reader for docutils. It didn't take any of the above shortcuts but uses a Parsley PEG grammar inspired by peg-markdown. Doesn't yet support directives.\nUPDATE: https://github.com/rtfd/recommonmark and is another docutils reader, natively supported on ReadTheDocs. Derived from remarkdown but uses the CommonMark-py parser. Doesn't support directives, but can convert more or less natural Markdown syntaxes to appropriate structures e.g. list of links to a toctree. For other needs, an ```eval_rst fenced block lets you embed any rST directive.\n\nIn all cases, you'll need to invent extensions of Markdown to represent Sphinx directives and roles. While you may not need all of them, some like .. toctree:: are essential.\nThis I think is the hardest part. reStructuredText before the Sphinx extensions was already richer than markdown. Even heavily extended markdown, such as pandoc's, is mostly a subset of rST feature set. That's a lot of ground to cover!\nImplementation-wise, the easiest thing is adding a generic construct to express any docutils role/directive. The obvious candidates for syntax inspiration are:\n\nAttribute syntax, which pandoc and some other implementations already allow on many inline and block constructs. For example `foo`{.method} -> `foo`:method:.\nHTML/XML. 
From foo to the kludgiest approach of just inserting docutils internal XML!\nSome kind of YAML for directives?\n\nBut such a generic mapping will not be the most markdown-ish solution...\nCurrently most active places to discuss markdown extensions are https://groups.google.com/forum/#!topic/pandoc-discuss, https://github.com/scholmd/scholmd/\nThis also means you can't just reuse a markdown parser without extending it somehow. Pandoc again lives up to its reputation as the swiss army knife of document conversion by supporting custom filters. (In fact, if I were to approach this I'd try to build a generic bridge between docutils readers/transformers/writers and pandoc readers/filters/writers. It's more than you need but the payoff would be much wider than just sphinx/markdown.)\n\nAlternative crazy idea: instead of extending markdown to handle Sphinx, extend reStructuredText to support (mostly) a superset of markdown! The beauty is you'll be able to use any Sphinx features as-is, yet be able to write most content in markdown.\nThere is already considerable syntax overlap; most notably link syntax is incompatible. I think if you add support to RST for markdown links, and ###-style headers, and change default `backticks` role to literal, and maybe change indented blocks to mean literal (RST supports > ... for quotations nowadays), you'll get something usable that supports most markdown.\n"} {"input": "Django datetime issues (default=datetime.now()) I have the following db model:\nfrom datetime import datetime \n\nclass TermPayment(models.Model):\n # cut out some fields, not relevant to the question\n date = models.DateTimeField(default=datetime.now(), blank=True)\n\nAnd when a new instance is added:\ntp = TermPayment.objects.create(**kwargs)\n\nI've an issue: all records in the database have the same value in the date field - the date of the first payment. After a server restart - one record has a new date and the others have the same as the first. 
It looks like some data cache is used but I can't find where.\ndatabase: mysql 5.1.25\ndjango v1.1.1\n", "output": "it looks like datetime.now() is being evaluated when the model is defined, and not each time you add a record.\nDjango has a feature to accomplish what you are trying to do already:\ndate = models.DateTimeField(auto_now_add=True, blank=True)\n\nor\ndate = models.DateTimeField(default=datetime.now, blank=True)\n\nThe difference between the second example and what you currently have is the lack of parentheses. By passing datetime.now without the parentheses, you are passing the actual function, which will be called each time a record is added. If you pass it datetime.now(), then you are just evaluating the function and passing it the return value.\nMore information is available at Django's model field reference\n"} {"input": "Multiple variables in Python 'with' statement Is it possible to declare more than one variable using a with statement in Python?\nSomething like:\nfrom __future__ import with_statement\n\nwith open(\"out.txt\",\"wt\"), open(\"in.txt\") as file_out, file_in:\n for line in file_in:\n file_out.write(line)\n\n... or is cleaning up two resources at the same time the problem?\n", "output": "It is possible in Python 3 since v3.1 and Python 2.7. The new with syntax supports multiple context managers:\nwith A() as a, B() as b, C() as c:\n doSomething(a,b,c)\n\nUnlike contextlib.nested, this guarantees that a and b will have their __exit__()'s called even if C() or its __enter__() method raises an exception.\n"} {"input": "Python date string to date object How do I convert a string to a date object in python? 
\nThe string would be: \"24052010\" (corresponding to the format: \"%d%m%Y\")\nI don't want a datetime.datetime object, but rather a datetime.date\n", "output": "You can use strptime in the datetime package of Python:\n>>> datetime.datetime.strptime('24052010', \"%d%m%Y\").date()\ndatetime.date(2010, 5, 24)\n\n"} {"input": "How are Python's Built In Dictionaries Implemented Does anyone know how the built in dictionary type for python is implemented? My understanding is that it is some sort of hash table, but I haven't been able to find any sort of definitive answer.\n", "output": "Here is everything about Python dicts that I was able to put together (probably more than anyone would like to know; but the answer is comprehensive). \n\nPython dictionaries are implemented as hash tables.\nHash tables must allow for hash collisions i.e. even if two distinct keys have the same hash value, the table's implementation must have a strategy to insert and retrieve the key and value pairs unambiguously.\nPython dict uses open addressing to resolve hash collisions (explained below) (see dictobject.c:296-297).\nPython hash table is just a contiguous block of memory (sort of like an array, so you can do an O(1) lookup by index). \nEach slot in the table can store one and only one entry. This is important.\nEach entry in the table is actually a combination of the three values: < hash, key, value >. This is implemented as a C struct (see dictobject.h:51-56).\nThe figure below is a logical representation of a Python hash table. In the figure below, 0, 1, ..., i, ... on the left are indices of the slots in the hash table (they are just for illustrative purposes and are not stored along with the table obviously!).\n# Logical model of Python Hash table\n-+-----------------+\n0| |\n-+-----------------+\n1| ... |\n-+-----------------+\n.| ... |\n-+-----------------+\ni| ... |\n-+-----------------+\n.| ... |\n-+-----------------+\nn| ... 
|\n-+-----------------+\n\nWhen a new dict is initialized it starts with 8 slots. (see dictobject.h:49)\nWhen adding entries to the table, we start with some slot, i, that is based on the hash of the key. CPython initially uses i = hash(key) & mask (where mask = PyDictMINSIZE - 1, but that's not really important). Just note that the initial slot, i, that is checked depends on the hash of the key.\nIf that slot is empty, the entry is added to the slot (by entry, I mean < hash, key, value >). But what if that slot is occupied!? Most likely because another entry has the same hash (hash collision!)\nIf the slot is occupied, CPython (and even PyPy) compares the hash AND the key (by compare I mean == comparison not the is comparison) of the entry in the slot against the key of the current entry to be inserted (dictobject.c:337,344-345). If both match, then it thinks the entry already exists, gives up and moves on to the next entry to be inserted. If either the hash or the key doesn't match, it starts probing. \nProbing just means it searches the slots one by one to find an empty slot. Technically we could just go one by one, i+1, i+2, ... and use the first available one (that's linear probing). But for reasons explained beautifully in the comments (see dictobject.c:33-126), CPython uses random probing. In random probing, the next slot is picked in a pseudo random order. The entry is added to the first empty slot. For this discussion, the actual algorithm used to pick the next slot is not really important (see dictobject.c:33-126 for the algorithm for probing). What is important is that the slots are probed until the first empty slot is found.\nThe same thing happens for lookups: it just starts with the initial slot i (where i depends on the hash of the key). If the hash and the key both don't match the entry in the slot, it starts probing, until it finds a slot with a match. If all slots are exhausted, it reports a failure.\nBTW, the dict will be resized if it is two-thirds full. 
This avoids slowing down lookups. (see dictobject.h:64-65)\n\nNOTE: I did the research on Python Dict implementation in response to my own question about how multiple entries in a dict can have same hash values. I posted a slightly edited version of the response here because all the research is very relevant for this question as well.\n"} {"input": "How to iterate over rows in a DataFrame? I have a DataFrames from pandas:\nimport pandas as pd\ninp = [{'c1':10, 'c2':100}, {'c1':11,'c2':110}, {'c1':12,'c2':120}]\ndf = pd.DataFrame(inp)\nprint df\n\nOutput:\n c1 c2\n0 10 100\n1 11 110\n2 12 120\n\nNow I want to iterate over the rows of the above frame. For every row I want to be able to access its elements (values in cells) by the name of the columns. So, for example, I would like to have something like that:\nfor row in df.rows:\n print row['c1'], row['c2']\n\nIs it possible to do that in pandas?\nI found similar question. But it does not give me the answer I need. For example, it is suggested there to use:\nfor date, row in df.T.iteritems():\n\nor\nfor row in df.iterrows():\n\nBut I do not understand what the row object is and how I can work with it.\n", "output": "iterrows is a generator which yield both index and row\nIn [18]: for index, row in df.iterrows():\n ....: print row['c1'], row['c2']\n ....: \n10 100\n11 110\n12 120\n\n"} {"input": "Split python string every nth character? \nPossible Duplicate:\nWhat is the most \u201cpythonic\u201d way to iterate over a list in chunks? \n\nIs it possible to split a python string every nth character?\nFor example, suppose I have a string containing the following:\n'1234567890'\n\nHow can I get it to look like this:\n['12','34','56','78','90']\n\n", "output": ">>> line = '1234567890'\n>>> n = 2\n>>> [line[i:i+n] for i in range(0, len(line), n)]\n['12', '34', '56', '78', '90']\n\n"} {"input": "Read password from stdin Scenario: An interactive CLI Python program, that is in need for a password. 
That also means there's no GUI solution possible.\nIn bash I could get a password read in without re-prompting it on screen via\nread -s\n\nIs there something similar for Python? I.e.,\npassword = raw_input('Password: ', dont_print_statement_back_to_screen)\n\nAlternative: Replace the typed characters with '*' before sending them back to screen (aka browser style).\n", "output": ">>> import getpass\n>>> pw = getpass.getpass()\n\n"} {"input": "How to filter the DataFrame rows of pandas by \"within\"/\"in\"? I have a Python pandas DataFrame rpt:\nrpt\n<class 'pandas.core.frame.DataFrame'>\nMultiIndex: 47518 entries, ('000002', '20120331') to ('603366', '20091231')\nData columns:\nSTK_ID 47518 non-null values\nSTK_Name 47518 non-null values\nRPT_Date 47518 non-null values\nsales 47518 non-null values\n\nI can filter the rows whose stock id is '600809' like this: rpt[rpt['STK_ID']=='600809']\n<class 'pandas.core.frame.DataFrame'>\nMultiIndex: 25 entries, ('600809', '20120331') to ('600809', '20060331')\nData columns:\nSTK_ID 25 non-null values\nSTK_Name 25 non-null values\nRPT_Date 25 non-null values\nsales 25 non-null values\n\nand I want to get all the rows of some stocks together, such as ['600809','600141','600329'], that means I want a syntax like this: \nstk_list = ['600809','600141','600329']\n\nrst = rpt[rpt['STK_ID'] in stk_list] ### this does not work in pandas \n\nSince pandas does not accept the above command, how can I achieve the target? \n", "output": "Use the isin method. rpt[rpt['STK_ID'].isin(stk_list)].\n"} {"input": "How to serve static files in Flask So this is embarrassing. I've got an application that I threw together in Flask and for now it is just serving up a single static HTML page with some links to CSS and JS. And I can't find where in the documentation Flask describes returning static files. Yes, I could use render_template but I know the data is not templatized. I'd have thought send_file or url_for was the right thing, but I could not get those to work. 
In the meantime, I am opening the files, reading content, and rigging up a Response with appropriate mimetype:\nimport os.path\n\nfrom flask import Flask, Response\n\n\napp = Flask(__name__)\napp.config.from_object(__name__)\n\n\ndef root_dir(): # pragma: no cover\n return os.path.abspath(os.path.dirname(__file__))\n\n\ndef get_file(filename): # pragma: no cover\n try:\n src = os.path.join(root_dir(), filename)\n # Figure out how flask returns static files\n # Tried:\n # - render_template\n # - send_file\n # This should not be so non-obvious\n return open(src).read()\n except IOError as exc:\n return str(exc)\n\n\n@app.route('/', methods=['GET'])\ndef metrics(): # pragma: no cover\n content = get_file('jenkins_analytics.html')\n return Response(content, mimetype=\"text/html\")\n\n\n@app.route('/', defaults={'path': ''})\n@app.route('/')\ndef get_resource(path): # pragma: no cover\n mimetypes = {\n \".css\": \"text/css\",\n \".html\": \"text/html\",\n \".js\": \"application/javascript\",\n }\n complete_path = os.path.join(root_dir(), path)\n ext = os.path.splitext(path)[1]\n mimetype = mimetypes.get(ext, \"text/html\")\n content = get_file(complete_path)\n return Response(content, mimetype=mimetype)\n\n\nif __name__ == '__main__': # pragma: no cover\n app.run(port=80)\n\nSomeone want to give a code sample or url for this? 
I know this is going to be dead simple.\n", "output": "The preferred method is to use nginx or another web server to serve static files; they'll be able to do it more efficiently than Flask.\nHowever, you can use send_from_directory to send files from a directory, which can be pretty convenient in some situations:\nfrom flask import Flask, request, send_from_directory\n\n# set the project root directory as the static folder, you can set others.\napp = Flask(__name__, static_url_path='')\n\n@app.route('/js/')\ndef send_js(path):\n return send_from_directory('js', path)\n\nif __name__ == \"__main__\":\n app.run()\n\nDo not use send_file or send_static_file with an user-supplied path.\nsend_static_file example:\nfrom flask import Flask, request\n# set the project root directory as the static folder, you can set others.\napp = Flask(__name__, static_url_path='')\n\n@app.route('/')\ndef root():\n return app.send_static_file('index.html')\n\n"} {"input": "Is a Python list guaranteed to have its elements stay in the order they are inserted in? If I have the following Python code\n>>> x = []\n>>> x = x + [1]\n>>> x = x + [2]\n>>> x = x + [3]\n>>> x\n[1, 2, 3]\n\nWill x be guaranteed to always be [1,2,3], or are other orderings of the interim elements possible?\n", "output": "Yes, the order of elements in a python list is persistent.\n"} {"input": "How does collections.defaultdict work? I've read the examples in python docs, but still can't figure out what this method means. Can somebody help? Here are two examples from the python docs\n>>> s = 'mississippi'\n>>> d = defaultdict(int)\n>>> for k in s:\n... d[k] += 1\n...\n>>> d.items()\n[('i', 4), ('p', 2), ('s', 4), ('m', 1)]\n\nand\n>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]\n>>> d = defaultdict(list)\n>>> for k, v in s:\n... d[k].append(v)\n...\n>>> d.items()\n[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]\n\nthe parameters int and list are for what? 
\n", "output": "Usually, a Python dictionary throws a KeyError if you try to get an item with a key that is not currently in the dictionary. The defaultdict in contrast will simply create any items that you try to access (provided of course they do not exist yet). To create such a \"default\" item, it calls the function object that you pass in the constructor (more precisely, it's an arbitrary \"callable\" object, which includes function and type objects). For the first example, default items are created using int(), which will return the integer object 0. For the second example, default items are created using list(), which returns a new empty list object.\n"} {"input": "Python: Extract numbers from a string I would extract all the numbers contained in a string. Which is the better suited for the purpose, regular expressions or the isdigit() method?\nExample:\nline = \"hello 12 hi 89\"\n\nResult:\n[12, 89]\n\n", "output": "I'd use a regexp :\n>>> import re\n>>> re.findall(r'\\d+', 'hello 42 I\\'m a 32 string 30')\n['42', '32', '30']\n\nThis would also match 42 from bla42bla. If you only want numbers delimited by word boundaries (space, period, comma), you can use \\b :\n>>> re.findall(r'\\b\\d+\\b', 'he33llo 42 I\\'m a 32 string 30')\n['42', '32', '30']\n\nTo end up with a list of numbers instead of a list of strings:\n>>> [int(s) for s in re.findall(r'\\b\\d+\\b', 'he33llo 42 I\\'m a 32 string 30')]\n[42, 32, 30]\n\n"} {"input": "Print the full numpy array When I print a numpy array, I get a truncated representation, but I want the full array. 
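Applied to the exact example from the number-extraction question above (`"hello 12 hi 89"`), the regex approach runs end to end like this:

```python
import re

line = "hello 12 hi 89"
# \d+ matches one or more consecutive digits; int() converts each match.
numbers = [int(s) for s in re.findall(r"\d+", line)]
print(numbers)  # → [12, 89]
```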
\nIs there any way to do this?\nExamples:\n>>> numpy.arange(10000)\narray([ 0, 1, 2, ..., 9997, 9998, 9999])\n>>> numpy.arange(10000).reshape(250,40)\narray([[ 0, 1, 2, ..., 37, 38, 39],\n [ 40, 41, 42, ..., 77, 78, 79],\n [ 80, 81, 82, ..., 117, 118, 119],\n ..., \n [9880, 9881, 9882, ..., 9917, 9918, 9919],\n [9920, 9921, 9922, ..., 9957, 9958, 9959],\n [9960, 9961, 9962, ..., 9997, 9998, 9999]])\n\n", "output": "To clarify on Reed's reply\nimport numpy\nnumpy.set_printoptions(threshold=numpy.nan)\n\nNote that the reply as given above works with an initial 'from numpy import *', which is not advisable. \nThis also works for me\nnumpy.set_printoptions(threshold='nan')\n\nFor full documentation, see http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html. \n"} {"input": "Why is IoC / DI not common in Python? In Java IoC / DI is a very common practice which is extensively used in web applications, nearly all available frameworks and Java EE. On the other hand, there are also lots of big Python web applications, but beside of Zope (which I've heard should be really horrible to code) IoC doesn't seem to be very common in the Python world. (Please name some examples if you think that I'm wrong).\nThere are of course several clones of popular Java IoC frameworks available for Python, springpython for example. But none of them seems to get used practically. At least, I've never stumpled upon a Django or sqlalchemy+ based web application which uses something like that.\nIn my opinion IoC has reasonable advantages and would make it easy to replace the django-default-user-model for example, but extensive usage of interface classes and IoC in Python looks a bit odd and not \u00c2\u00bbpythonic\u00c2\u00ab. But maybe someone has a better explanation, why IoC isn't widely used in Python.\n", "output": "I don't actually think that DI/IoC are that uncommon in Python. 
What is uncommon, however, are DI/IoC frameworks/containers.\nThink about it: what does a DI container do? It allows you to \n\nwire together independent components into a complete application ...\n... at runtime.\n\nWe have names for \"wiring together\" and \"at runtime\":\n\nscripting\ndynamic\n\nSo, a DI container is nothing but an interpreter for a dynamic scripting language. Actually, let me rephrase that: a typical Java/.NET DI container is nothing but a crappy interpreter for a really bad dynamic scripting language with butt-ugly, often XML-based, syntax.\nWhen you program in Python, why would you want to use an ugly, bad scripting language when you have a beautiful, brilliant scripting language at your disposal? Actually, that's a more general question: when you program in pretty much any language, why would you want to use an ugly, bad scripting language when you have Jython and IronPython at your disposal?\nSo, to recap: the practice of DI/IoC is just as important in Python as it is in Java, for exactly the same reasons. The implementation of DI/IoC however, is built into the language and often so lightweight that it completely vanishes. \n(Here's a brief aside for an analogy: in assembly, a subroutine call is a pretty major deal - you have to save your local variables and registers to memory, save your return address somewhere, change the instruction pointer to the subroutine you are calling, arrange for it to somehow jump back into your subroutine when it is finished, put the arguments somewhere where the callee can find them, and so on. IOW: in assembly, \"subroutine call\" is a Design Pattern, and before there were languages like Fortran which had subroutine calls built in, people were building their own \"subroutine frameworks\". 
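To make the point concrete, here is a minimal, hypothetical sketch of dependency injection in plain Python — no container, just a constructor parameter with a sensible default, which is all the "wiring" a dynamic language needs:

```python
class SmtpSender:
    def send(self, msg):
        return "smtp:" + msg

class ConsoleSender:  # a test double we can inject instead
    def send(self, msg):
        return "console:" + msg

class Notifier:
    # The dependency is injected through the constructor; the default
    # wires up the "production" collaborator. No XML, no container.
    def __init__(self, sender=None):
        self.sender = sender if sender is not None else SmtpSender()

    def notify(self, msg):
        return self.sender.send(msg)

print(Notifier().notify("hi"))                 # → smtp:hi  (default wiring)
print(Notifier(ConsoleSender()).notify("hi"))  # → console:hi  (swapped at runtime)
```

All class and method names here are invented for illustration; the point is only that the "wiring together at runtime" a DI container provides is a one-line constructor argument in Python.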
Would you say that subroutine calls are \"uncommon\" in Python, just because you don't use subroutine frameworks?)\nBTW: for an example of what it looks like to take DI to its logical conclusion, take a look at Gilad Bracha's Newspeak Programming Language and his writings on the subject:\n\nConstructors Considered Harmful\nLethal Injection\nA Ban on Imports (continued)\n\n"} {"input": "Best way to log a Python exception I'm printing my exceptions to a log file currently with:\ntry:\n # coode in here\nexcept Exception, e:\n logging.error(e)\n\nCould I be printing more information about the exception and the code that generated it than just the exception string? Things like line numbers or stack traces would be great.\n", "output": "logger.exception does that. \nFor example:\nimport logging\ntry:\n 1/0\nexcept Exception as e:\n logging.exception(\"message\")\n\nOutput:\nERROR:root:message\nTraceback (most recent call last):\n File \"\", line 2, in \nZeroDivisionError: integer division or modulo by zero\n\n@Paulo Cheque notes, \"be aware that in Python 3 you must call the logging.exception method just inside the except part. If you call this method in an arbitrary place you may get a bizarre exception. The docs alert about that.\"\n"} {"input": "Why wasn't PyPy included in standard Python? I was looking at PyPy and I was just wondering why it hasn't been adopted into the mainline Python distributions. Wouldn't things like JIT compilation and lower memory footprint greatly improve the speeds of all Python code?\nIn short, what are the main drawbacks of PyPy that cause it to remain a separate project?\n", "output": "PyPy is not a fork of CPython, so it could never be merged directly into CPython.\nTheoretically the Python community could universally adopt PyPy, PyPy could be made the reference implementation, and CPython could be discontinued. 
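A closely related variant of the `logger.exception` pattern shown in the logging answer earlier: passing `exc_info=True` to any logging call attaches the same traceback. The sketch below routes the output into a string buffer so the result can be inspected (assuming Python 3):

```python
import io
import logging

# Route log records into a string buffer so we can inspect the output.
buf = io.StringIO()
logger = logging.getLogger("demo")
logger.addHandler(logging.StreamHandler(buf))
logger.propagate = False  # keep the record out of the root logger

try:
    1 / 0
except ZeroDivisionError:
    # Equivalent to logger.exception("message"): ERROR level plus traceback.
    logger.error("message", exc_info=True)

output = buf.getvalue()
print("ZeroDivisionError" in output)  # → True
```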
However, PyPy has its own weaknesses:\n\nCPython is easy to integrate with Python modules written in C, which is traditionally the way Python applications have handled CPU-intensive tasks (see for instance the SciPy project).\nThe PyPy JIT compilation step itself costs CPU time -- it's only through repeated running of compiled code that it becomes faster overall. This means startup times can be higher, and therefore PyPy isn't necessarily as efficient for running glue code or trivial scripts.\nPyPy and CPython behavior is not identical in all respects, especially when it comes to \"implementation details\" (behavior that is not specified by the language but is still important at a practical level).\nCPython runs on more architectures than PyPy and has been successfully adapted to run in embedded architectures in ways that may be impractical for PyPy.\nCPython's reference counting scheme for memory management arguably has more predictable performance impacts than PyPy's various GC systems, although this isn't necessarily true of all \"pure GC\" strategies.\nPyPy does not yet fully support Python 3.x, although that is an active work item.\n\nPyPy is a great project, but runtime speed on CPU-intensive tasks isn't everything, and in many applications it's the least of many concerns. For instance, Django can run on PyPy and that makes templating faster, but CPython's database drivers are faster than PyPy's; in the end, which implementation is more efficient depends on where the bottleneck in a given application is.\nAnother example: you'd think PyPy would be great for games, but most GC strategies like those used in PyPy cause noticeable jitter. For CPython, most of the CPU-intensive game stuff is offloaded to the PyGame library, which PyPy can't take advantage of since PyGame is primarily implemented as a C extension (though see: pygame-cffi). 
I still think PyPy can be a great platform for games, but I've never seen it actually used.\nPyPy and CPython have radically different approaches to fundamental design questions and make different tradeoffs, so neither one is \"better\" than the other in every case.\n"} {"input": "Adding 5 days to a date in Python I have a date \"10/10/11(m-d-y)\" and I want to add 5 days to it using a Python script. Please consider a general solution that works on the month ends also.\nI am using following code:\nimport re\nfrom datetime import datetime\n\nStartDate = \"10/10/11\"\n\nDate = datetime.strptime(StartDate, \"%m/%d/%y\")\n\nprint Date -> is printing '2011-10-10 00:00:00'\nNow I want to add 5 days to this date. I used the following code:\nEndDate = Date.today()+timedelta(days=10)\n\nWhich returned this error:\nname 'timedelta' is not defined\n\n", "output": "The previous answers are correct but it's generally a better practice to do:\nimport datetime\n\nThen you'll have, using datetime.timedelta:\ndate_1 = datetime.datetime.strptime(start_date, \"%m/%d/%y\")\n\nend_date = date_1 + datetime.timedelta(days=10)\n\n"} {"input": "How to write a Python module? I've been making Python scripts for simple tasks at work and never really bothered packaging them for others to use. Now I have been assigned to make a Python wrapper for a REST API. I have absolutely no idea on how to start and I need help.\nWhat I have:\n(Just want to be specific as possible) I have the virtualenv ready, it's also up in github, the .gitignore file for python is there as well, plus, the requests library for interacting with the REST API. 
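The date-arithmetic fix discussed above, written out as a complete Python 3 sketch using the question's "10/10/11" start date and 5-day offset:

```python
from datetime import datetime, timedelta

start_date = "10/10/11"
date_1 = datetime.strptime(start_date, "%m/%d/%y")

# timedelta lives in the datetime module; importing it explicitly
# avoids the NameError from the question.
end_date = date_1 + timedelta(days=5)
print(end_date.strftime("%m/%d/%y"))  # → 10/15/11
```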
That's it.\nHere's the current directory tree\n.\n\u00e2\u0094\u009c\u00e2\u0094\u0080\u00e2\u0094\u0080 bin\n\u00e2\u0094\u0082\u00c2\u00a0\u00c2\u00a0 \u00e2\u0094\u0094\u00e2\u0094\u0080\u00e2\u0094\u0080 /the usual stuff/\n\u00e2\u0094\u009c\u00e2\u0094\u0080\u00e2\u0094\u0080 include\n\u00e2\u0094\u0082\u00c2\u00a0\u00c2\u00a0 \u00e2\u0094\u0094\u00e2\u0094\u0080\u00e2\u0094\u0080 /the usual stuff/\n\u00e2\u0094\u009c\u00e2\u0094\u0080\u00e2\u0094\u0080 lib\n\u00e2\u0094\u0082\u00c2\u00a0\u00c2\u00a0 \u00e2\u0094\u0094\u00e2\u0094\u0080\u00e2\u0094\u0080 python2.7\n\u00e2\u0094\u0082\u00c2\u00a0\u00c2\u00a0 \u00e2\u0094\u0094\u00e2\u0094\u0080\u00e2\u0094\u0080 /the usual stuff/\n\u00e2\u0094\u009c\u00e2\u0094\u0080\u00e2\u0094\u0080 local\n\u00e2\u0094\u0082\u00c2\u00a0\u00c2\u00a0 \u00e2\u0094\u0094\u00e2\u0094\u0080\u00e2\u0094\u0080 /the usual stuff/\n\u00e2\u0094\u0094\u00e2\u0094\u0080\u00e2\u0094\u0080 README.md\n\n27 directories, 280 files\n\nI don't even know where to put the .py files, if I ever make one.\nWhat I wanted to do:\nMake a python module install-able with \"pip install ...\"\nIf possible, I want a general step by step process on writing Python modules.\n", "output": "A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py\ncreate hello.py then write the following function as its content:\ndef helloworld():\n print \"hello\"\n\nThen you can import hello:\n>>> import hello\n>>> hello.helloworld()\n'hello'\n>>>\n\nTo group many .py files put them in a folder. Any folder with an __init__.py is considered a module by python and you can call them a package\n|-HelloModule\n |_ __init__.py\n |_ hellomodule.py\n\nYou can go about with the import statement on your module the usual way.\nFor more information, see 6.4. 
Packages.\n"} {"input": "Python list of dictionaries search Assume I have this:\n[\n{\"name\": \"Tom\", \"age\": 10},\n{\"name\": \"Mark\", \"age\": 5},\n{\"name\": \"Pam\", \"age\": 7}\n]\n\nand by searching \"Pam\" as name, I want to retrieve the related dictionary: {name: \"Pam\", age: 7}\nHow to achieve this ?\n", "output": "You can use a generator expression:\n>>> dicts = [\n... { \"name\": \"Tom\", \"age\": 10 },\n... { \"name\": \"Mark\", \"age\": 5 },\n... { \"name\": \"Pam\", \"age\": 7 },\n... { \"name\": \"Dick\", \"age\": 12 }\n... ]\n\n>>> (item for item in dicts if item[\"name\"] == \"Pam\").next()\n{'age': 7, 'name': 'Pam'}\n\n"} {"input": "Python: create object and add attributes to it I want to create a dynamic object (inside another object) in Python and then add attributes to it.\nI tried:\nobj = someobject\nobj.a = object()\nsetattr(obj.a, 'somefield', 'somevalue')\n\nbut this didn't work.\nAny ideas?\nedit:\nI am setting the attributes from a for loop which loops through a list of values, e.g.\nparams = ['attr1', 'attr2', 'attr3']\nobj = someobject\nobj.a = object()\n\nfor p in params:\n obj.a.p # where p comes from for loop variable\n\nIn the above example I would get obj.a.attr1, obj.a.attr2, obj.a.attr3. \nI used the setattr function because I didn't know how to do obj.a.NAME from a for loop.\nHow would I set the attribute based on the value of p in the example above?\n", "output": "The built-in object can be instantiated but can't have any attributes set on it. (I wish it could, for this exact purpose.) It doesn't have a __dict__ to hold the attributes.\nI generally just do this:\nclass Object(object):\n pass\n\na = Object()\na.somefield = somevalue\n\nWhen I can, I give the Object class a more meaningful name, depending on what kind of data I'm putting in it.\nSome people do a different thing, where they use a sub-class of dict that allows attribute access to get at the keys. 
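A note on the dictionary-search answer above: in Python 3 the `.next()` method is gone; the built-in `next()` does the same job and also accepts a default, which avoids a `StopIteration` when nothing matches:

```python
dicts = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

# next() pulls the first match from the generator; the second argument
# is returned instead of raising StopIteration when there is no match.
pam = next((item for item in dicts if item["name"] == "Pam"), None)
missing = next((item for item in dicts if item["name"] == "Bob"), None)
print(pam)      # → {'name': 'Pam', 'age': 7}
print(missing)  # → None
```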
(d.key instead of d['key'])\nEdit: For the addition to your question, using setattr is fine. You just can't use setattr on object() instances.\nparams = ['attr1', 'attr2', 'attr3']\nfor p in params:\n setattr(obj.a, p, value)\n\n"} {"input": "String Interpolation in Python Is there a Python equivalent to Ruby's string interpolation?\nRuby example:\nname = \"Spongebob Squarepants\"\nputs \"Who lives in a Pineapple under the sea? \\n#{name}.\"\n\nWhilst I could succeed using string concatenation in Python, it's seemingly verbose to me coming from Ruby.\n", "output": "Python 3.6 will add literal string interpolation similar to Ruby's string interpolation. Starting with that version of Python (which is scheduled to be released by the end of 2016), you will be able to include expressions in \"f-strings\", e.g.\nname = \"Spongebob Squarepants\"\nprint(f\"Who lives in a Pineapple under the sea? {name}.\")\n\nPrior to 3.6, the closest you can get to this is\nname = \"Spongebob Squarepants\"\nprint(\"Who lives in a Pineapple under the sea? %(name)s.\" % locals())\n\nThe % operator can be used for string interpolation in Python. The first operand is the string to be interpolated, the second can have different types including a \"mapping\", mapping field names to the values to be interpolated. Here I used the dictionary of local variables locals() to map the field name name to its value as a local variable.\nThe same code using the .format() method of recent Python versions would look like this:\nname = \"Spongebob Squarepants\"\nprint(\"Who lives in a Pineapple under the sea? {name!s}.\".format(**locals()))\n\nThere is also the string.Template class:\ntmpl = string.Template(\"Who lives in a Pineapple under the sea? $name.\")\nprint(tmpl.substitute(name=\"Spongebob Squarepants\"))\n\n"} {"input": "What is the difference between pip and conda? I know pip is a package manager for python packages. 
However, I saw the installation on iPython's website use conda to install iPython.\nCan I use pip to install iPython? Why should I use conda as another python package manager when I already have pip?\nWhat is the difference between pip and conda?\n", "output": "Quoting from the Conda blog:\n\nHaving been involved in the python world for so long, we are all aware of pip, easy_install, and virtualenv, but these tools did not meet all of our specific requirements. The main problem is that they are focused around Python, neglecting non-Python library dependencies, such as HDF5, MKL, LLVM, etc., which do not have a setup.py in their source code and also do not install files into Python\u00e2\u0080\u0099s site-packages directory.\n\nSo Conda is a packaging tool and installer that aims to do more than what pip does; handle library dependencies outside of the Python packages as well as the Python packages themselves. Conda also creates a virtual environment, like virtualenv does.\nAs such, Conda should be compared to Buildout perhaps, another tool that lets you handle both Python and non-Python installation tasks.\nBecause Conda introduces a new packaging format, you cannot use pip and Conda interchangeably; pip cannot install the Conda package format. You can use the two tools side by side but they do not interoperate either.\n"} {"input": "Find full path of the Python interpreter? How do I find the full path of the currently running Python interpreter from within the currently executing Python script?\n", "output": "sys.executable contains full path of the currently running Python interpreter.\nimport sys\n\nprint(sys.executable)\n\nwhich is now documented here\n"} {"input": "In Python, what is the difference between '/' and '//' when used for division? Is there a benefit to using one over the other? They both seem to return the same results.\n>>> 6/3\n2\n>>> 6//3\n2\n\n", "output": "In Python 3.0, 5 / 2 will return 2.5 and 5 // 2 will return 2. 
The former is floating point division, and the latter is floor division, sometimes also called integer division.\nIn Python 2.2 or later in the 2.x line, there is no difference for integers unless you perform a from __future__ import division, which causes Python 2.x to adopt the behavior of 3.0\nRegardless of the future import, 5.0 // 2 will return 2.0 since that's the floor division result of the operation.\nYou can find a detailed description at https://docs.python.org/whatsnew/2.2.html#pep-238-changing-the-division-operator\n"} {"input": "Redirect stdout to a file in Python? How do I redirect stdout to an arbitrary file in Python?\nWhen a long-running Python script (e.g, web application) is started from within the ssh session and backgounded, and the ssh session is closed, the application will raise IOError and fail the moment it tries to write to stdout. I needed to find a way to make the application and modules output to a file rather than stdout to prevent failure due to IOError. Currently, I employ nohup to redirect output to a file, and that gets the job done, but I was wondering if there was a way to do it without using nohup, out of curiosity.\nI have already tried sys.stdout = open('somefile', 'w'), but this does not seem to prevent some external modules from still outputting to terminal (or maybe the sys.stdout = ... line did not fire at all). I know it should work from simpler scripts I've tested on, but I also didn't have time yet to test on a web application yet.\n", "output": "If you want to do the redirection within the Python script, set sys.stdout to an file object does the trick:\nimport sys\nsys.stdout = open('file', 'w')\nprint 'test'\n\nA far more common method is to use shell redirection when executing (same on Windows and Linux):\n$ python foo.py > file\n\n"} {"input": "Find and list duplicates in Python list How can I find the duplicates in a Python list and create another list of the duplicates? 
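On Python 3.4+, `contextlib.redirect_stdout` offers a scoped alternative to reassigning `sys.stdout` by hand, as the stdout-redirection answer above does — the original stream is restored automatically when the block exits:

```python
import io
from contextlib import redirect_stdout

buf = io.StringIO()
# Everything printed inside the block goes to buf, not the terminal;
# sys.stdout is restored automatically on exit.
with redirect_stdout(buf):
    print("test")

captured = buf.getvalue()
print(repr(captured))  # → 'test\n'
```

This only captures Python-level writes; output from C extensions that write to the process's file descriptor directly is not redirected.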
The list is just integers.\n", "output": "To remove duplicates use set(a), to print duplicates - something like\na = [1,2,3,2,1,5,6,5,5,5]\n\nimport collections\nprint [item for item, count in collections.Counter(a).items() if count > 1]\n\n## [1, 2, 5]\n\nNote that Counter is not particularly efficient (timings) and probably an overkill here, set will perform better:\nseen = set()\nuniq = []\nfor x in a:\n if x not in seen:\n uniq.append(x)\n seen.add(x)\n\nor, more concisely:\nseen = set()\nuniq = [x for x in a if x not in seen and not seen.add(x)] \n\nI don't recommend the latter style though.\n"} {"input": "What's the difference between filter and filter_by in SQLAlchemy? Could anyone explain the difference between filter and filter_by functions in SQLAlchemy? I am confused and can't really see the difference. Which one should I be using?\n", "output": "filter_by is used for simple queries on the column names like\ndb.users.filter_by(name='Joe')\nThe same can be accomplished with filter by writing\ndb.users.filter(db.users.name=='Joe')\nbut you can also write more powerful queries containing expressions like\ndb.users.filter(or_(db.users.name=='Ryan', db.users.country=='England'))\n"} {"input": "How do I calculate the date six months from the current date using the datetime Python module? I am using the datetime Python module. I am looking to calculate the date 6 months from the current date. Could someone give me a little help doing this?\nThe reason I want to generate a date 6 months from the current date is to produce a Review Date. If the user enters data into the system it will have a review date of 6 months from the date they entered the data. \n", "output": "I found this solution to be good. (This uses the python-dateutil extension)\nfrom datetime import date\nfrom dateutil.relativedelta import relativedelta\n\nsix_months = date.today() + relativedelta(months=+6)\n\nThe advantage of this approach is that it takes care of issues with 28, 30, 31 days etc. 
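The `Counter` approach from the duplicates answer above, run end to end on the example list in Python 3 syntax:

```python
from collections import Counter

a = [1, 2, 3, 2, 1, 5, 6, 5, 5, 5]

# Count occurrences, then keep the values seen more than once.
duplicates = [item for item, count in Counter(a).items() if count > 1]
print(sorted(duplicates))  # → [1, 2, 5]
```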
This becomes very useful in handling business rules and scenarios (say invoice generation etc.)\n$ date(2010,12,31)+relativedelta(months=+1)\n datetime.date(2011, 1, 31)\n\n$ date(2010,12,31)+relativedelta(months=+2)\n datetime.date(2011, 2, 28)\n\n"} {"input": "ImportError: No module named requests Whenever I run this code, I consistently get an error saying No module Named requests:\nimport requests\n\nThe error I get:\n File \"ex2.py\", line 1, in \n import requests\nImportError: No module named requests\n\n", "output": "Requests is not a built in module, so you will have to download it. You can get it here: https://pypi.python.org/pypi/requests\nOSX/Linux\nUse $ sudo pip install requests if you have pip installed\nOn OSX you can also use sudo easy_install -U requests if you have easy_install installed.\nWindows\nUse > Path\\easy_install.exe requests if you have a windows machine, where easy_install can be found in your Python*\\Scripts folder, if it was installed. (Note Path\\easy_install.exe is an example, mine is C:\\Python32\\Scripts\\easy_install.exe)\nIf you don't have easy install and are running on a windows machine, you can get it here: http://www.lfd.uci.edu/~gohlke/pythonlibs/#distribute\nIf you manually want to add a library to a windows machine, you can download the compressed library, uncompress it, and then place it into the Lib folder of your python path.\nFrom Source (Universal)\nFor any missing library, the source is usually available at https://pypi.python.org/pypi/. Then:\nOn mac osx and windows, after downloading the source zip, uncompress it and from the termiminal/cmd run python setup.py install from the uncompressed dir.\n"} {"input": "How do I clone a Django model instance object and save it to the database? Foo.objects.get(pk=\"foo\")\n\n\nIn the database, I want to add another object which is a copy of the object above.\nSuppose my table has one row. I want to insert the first row object into another row with a different primary key. 
How can I do that?\n", "output": "Just change the primary key of your object and run save().\nobj = Foo.objects.get(pk=\"foo\")\nobj.pk = \"bar\"\nobj.save()\n\nIf you want auto-generated key, set the new key to None.\nMore on UPDATE/INSERT here.\n"} {"input": "In Python, when to use a Dictionary, List or Set? When should I use a dictionary, list or set?\nAre there scenarios that are more suited for each data type?\n", "output": "A list keeps order, dict and set don't: when you care about order, therefore, you must use list (if your choice of containers is limited to these three, of course;-).\ndict associates with each key a value, while list and set just contain values: very different use cases, obviously.\nset requires items to be hashable, list doesn't: if you have non-hashable items, therefore, you cannot use set and must instead use list.\nset forbids duplicates, list does not: also a crucial distinction. (A \"multiset\", which maps duplicates into a different count for items present more than once, can be found in collections.Counter -- you could build one as a dict, if for some weird reason you couldn't import collections, or, in pre-2.7 Python as a collections.defaultdict(int), using the items as keys and the associated value as the count).\nChecking for membership of a value in a set (or dict, for keys) is blazingly fast (taking about a constant, short time), while in a list it takes time proportional to the list's length in the average and worst cases. So, if you have hashable items, don't care either way about order or duplicates, and want speedy membership checking, set is better than list.\n"} {"input": "Making Python loggers output all messages to stdout in addition to log Is there a way to make Python logging using the logging module automatically output things to stdout in addition to the log file where they are supposed to go? 
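The dict/list/set distinctions listed above can be checked directly at the interpreter — a small sketch:

```python
# list: keeps insertion order and allows duplicates
seq = [3, 1, 3, 2]
print(seq)            # → [3, 1, 3, 2]

# set: no duplicates, fast membership tests
uniq = set(seq)
print(3 in uniq)      # → True
print(len(uniq))      # → 3

# dict: maps hashable keys to values
ages = {"alice": 30, "bob": 25}
print(ages["alice"])  # → 30

# set elements (and dict keys) must be hashable: a list cannot qualify
try:
    {[1, 2]}
except TypeError:
    print("lists are unhashable")
```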
For example, I'd like all calls to logger.warning, logger.critical, logger.error to go to their intended places but in addition always be copied to stdout. This is to avoid duplicating messages like:\nmylogger.critical(\"something failed\")\nprint \"something failed\"\n\n", "output": "All logging output is handled by the handlers; just add a logging.StreamHandler() to the root logger.\nHere's an example configuring a stream handler (using stdout instead of the default stderr) and adding it to the root logger:\nimport logging\nimport sys\n\nroot = logging.getLogger()\nroot.setLevel(logging.DEBUG)\n\nch = logging.StreamHandler(sys.stdout)\nch.setLevel(logging.DEBUG)\nformatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')\nch.setFormatter(formatter)\nroot.addHandler(ch)\n\n"} {"input": "Generating an MD5 checksum of a file Is there any simple way of generating (and checking) MD5 checksums of a list of files in Python? (I have a small program I'm working on, and I'd like to confirm the checksums of the files).\n", "output": "There is a way that's pretty memory inefficient.\nsingle file:\nprint hashlib.md5(open(full_path, 'rb').read()).hexdigest()\n\nlist of files:\nimport hashlib\n[(fname, hashlib.md5(open(fname, 'rb').read()).digest()) for fname in fnamelst]\n\nBut, MD5 is known broken and (IMHO) should come with scary deprecation warnings and removed from the library, so here's how you should actually do it:\nimport hashlib\n[(fname, hashlib.sha256(open(fname, 'rb').read()).digest()) for fname in fnamelst]\n\nIf you only want 128 bits worth of digest you can do .digest()[:16].\nThis will give you a list of tuples, each tuple containing the name of its file and its hash.\nAgain I strongly question your use of MD5. You should be at least using SHA1. Some people think that as long as you're not using MD5 for 'cryptographic' purposes, you're fine. 
But stuff has a tendency to end up being broader in scope than you initially expect, and your casual vulnerability analysis may prove completely flawed. It's best to just get in the habit of using the right algorithm out of the gate. It's just typing a different bunch of letters is all. It's not that hard.\nHere is a way that is more complex, but memory efficient:\nimport hashlib\ndef hashfile(afile, hasher, blocksize=65536):\n buf = afile.read(blocksize)\n while len(buf) > 0:\n hasher.update(buf)\n buf = afile.read(blocksize)\n return hasher.digest()\n\n[(fname, hashfile(open(fname, 'rb'), hashlib.md5())) for fname in fnamelst]\n\nAnd, again, since MD5 is broken and should not really ever be used anymore:\nimport hashlib\ndef hashfile(afile, hasher, blocksize=65536):\n buf = afile.read(blocksize)\n while len(buf) > 0:\n hasher.update(buf)\n buf = afile.read(blocksize)\n return hasher.digest()\n\n[(fname, hashfile(open(fname, 'rb'), hashlib.sha256())) for fname in fnamelst]\n\nAgain, you can put [:16] after the call to hashfile(...) if you only want 128 bits worth of digest.\n"} {"input": "How to terminate a python subprocess launched with shell=True I'm launching a subprocess with the following command:\np = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)\n\nHowever, when I try to kill using:\np.terminate()\n\nor \np.kill()\n\nThe command keeps running in the background, so I was wondering how can I actually terminate the process. \nNote that when I run the command with:\np = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)\n\nIt does terminate successfully when issuing the p.terminate().\n", "output": "Use a process group so as to enable sending a signal to all the process in the groups. For that, you should attach a session id to the parent process of the spawned/child processes, which is a shell in your case. This will make it the group leader of the processes. 
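A self-contained version of the memory-efficient chunked-hashing idea above, using SHA-256 and a temporary file so it can run anywhere (the helper name and block size are illustrative choices, not from the original answer):

```python
import hashlib
import os
import tempfile

def hash_file(path, blocksize=65536):
    # Read the file in fixed-size chunks so memory use stays flat
    # no matter how large the file is.
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(blocksize), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

data = b"hello world" * 10000  # larger than one block, to exercise the loop
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    tmp_path = tmp.name

digest = hash_file(tmp_path)
print(digest == hashlib.sha256(data).hexdigest())  # → True
os.unlink(tmp_path)
```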
So now, when a signal is sent to the process group leader, it's transmitted to all of the child processes of this group.\nHere's the code:\nimport os\nimport signal\nimport subprocess\n\n# The os.setsid() is passed in the argument preexec_fn so\n# it's run after the fork() and before exec() to run the shell.\npro = subprocess.Popen(cmd, stdout=subprocess.PIPE, \n shell=True, preexec_fn=os.setsid) \n\nos.killpg(os.getpgid(pro.pid), signal.SIGTERM) # Send the signal to all processes in the group\n\n"} {"input": "Installing SciPy with pip It is possible to install NumPy with pip using pip install numpy. \nIs there a similar possibility with SciPy? (Doing pip install scipy does not work.)\n\nUpdate\nThe package SciPy is now available to be installed with pip!\n", "output": "Prerequisite:\nsudo apt-get install build-essential gfortran libatlas-base-dev python-pip python-dev\nsudo pip install --upgrade pip\n\nActual packages:\nsudo pip install numpy\nsudo pip install scipy\n\nOptional packages:\nsudo pip install matplotlib OR sudo apt-get install python-matplotlib\nsudo pip install -U scikit-learn\nsudo pip install pandas\n\n"} {"input": "How to send an email with Gmail as provider using Python? 
I am trying to send email (Gmail) using Python, but I am getting the following error.\nTraceback (most recent call last): \nFile \"emailSend.py\", line 14, in <module>\nserver.login(username,password) \nFile \"/usr/lib/python2.5/smtplib.py\", line 554, in login \nraise SMTPException(\"SMTP AUTH extension not supported by server.\") \nsmtplib.SMTPException: SMTP AUTH extension not supported by server.\n\nThe Python script is the following.\nimport smtplib\nfromaddr = 'user_me@gmail.com'\ntoaddrs = 'user_you@gmail.com'\nmsg = 'Why,Oh why!'\nusername = 'user_me@gmail.com'\npassword = 'pwd'\nserver = smtplib.SMTP('smtp.gmail.com:587')\nserver.starttls()\nserver.login(username,password)\nserver.sendmail(fromaddr, toaddrs, msg)\nserver.quit()\n\n", "output": "def send_email(user, pwd, recipient, subject, body):\n import smtplib\n\n gmail_user = user\n gmail_pwd = pwd\n FROM = user\n TO = recipient if type(recipient) is list else [recipient]\n SUBJECT = subject\n TEXT = body\n\n # Prepare actual message\n message = \"\"\"From: %s\\nTo: %s\\nSubject: %s\\n\\n%s\n \"\"\" % (FROM, \", \".join(TO), SUBJECT, TEXT)\n try:\n server = smtplib.SMTP(\"smtp.gmail.com\", 587)\n server.ehlo()\n server.starttls()\n server.login(gmail_user, gmail_pwd)\n server.sendmail(FROM, TO, message)\n server.close()\n print 'successfully sent the mail'\n except:\n print \"failed to send mail\"\n\nIf you want to use port 465 you have to create an SMTP_SSL object:\n# SMTP_SSL Example\nserver_ssl = smtplib.SMTP_SSL(\"smtp.gmail.com\", 465)\nserver_ssl.ehlo() # optional, called by login()\nserver_ssl.login(gmail_user, gmail_pwd) \n# ssl server doesn't support or need tls, so don't call server_ssl.starttls() \nserver_ssl.sendmail(FROM, TO, message)\n#server_ssl.quit()\nserver_ssl.close()\nprint 'successfully sent the mail'\n\n
presented in the PHP variant of this question will probably work with some minor adjustments, but don't seem very 'pythonic' to me.\nFor the record, I don't just want to strip periods and commas (and other punctuation), but also quotes, brackets, etc.\n", "output": "I just timed some functions out of curiosity. In these tests I'm removing non-alphanumeric characters from the string string.printable (part of the built-in string module).\n$ python -m timeit -s \\\n \"import string\" \\\n \"''.join(ch for ch in string.printable if ch.isalnum())\" \n10000 loops, best of 3: 57.6 usec per loop\n\n$ python -m timeit -s \\\n \"import string\" \\\n \"filter(str.isalnum, string.printable)\" \n10000 loops, best of 3: 37.9 usec per loop\n\n$ python -m timeit -s \\\n \"import re, string\" \\\n \"re.sub('[\\W_]', '', string.printable)\"\n10000 loops, best of 3: 27.5 usec per loop\n\n$ python -m timeit -s \\\n \"import re, string\" \\\n \"re.sub('[\\W_]+', '', string.printable)\" \n100000 loops, best of 3: 15 usec per loop\n\n$ python -m timeit -s \\\n \"import re, string; pattern = re.compile('[\\W_]+')\" \\\n \"pattern.sub('', string.printable)\" \n100000 loops, best of 3: 11.2 usec per loop\n\n"} {"input": "Accessing dict keys like an attribute in Python? I find it more convenient to access dict keys as obj.foo instead of obj['foo'], so I wrote this snippet:\nclass AttributeDict(dict):\n def __getattr__(self, attr):\n return self[attr]\n def __setattr__(self, attr, value):\n self[attr] = value\n\nHowever, I assume there must be some reason that Python doesn't provide this functionality out of the box. What would be the caveats and pitfalls of accessing dict keys in this manner?\n", "output": "The best way to do this is:\nclass AttrDict(dict):\n def __init__(self, *args, **kwargs):\n super(AttrDict, self).__init__(*args, **kwargs)\n self.__dict__ = self\n\nSome pros:\n\nIt actually works!\nNo dictionary class methods are shadowed (e.g. 
.keys() works just fine)\nAttributes and items are always in sync\nTrying to access a non-existent key as an attribute correctly raises AttributeError instead of KeyError\n\nCons:\n\nMethods like .keys() will not work just fine if they get overwritten by incoming data\nCauses a memory leak in Python < 2.7.4 / Python3 < 3.2.3\nPylint goes bananas with E1123(unexpected-keyword-arg) and E1103(maybe-no-member)\nFor the uninitiated it seems like pure magic.\n\nA short explanation on how this works\n\nAll Python objects internally store their attributes in a dictionary that is named __dict__.\nThere is no requirement that the internal dictionary __dict__ would need to be \"just a plain dict\", so we can assign any subclass of dict() to the internal dictionary.\nIn our case we simply assign the AttrDict() instance we are instantiating (as we are in __init__).\nBy calling super()'s __init__() method we made sure that it (already) behaves exactly like a dictionary, since that function calls all the dictionary instantiation code.\n\nOne reason why Python doesn't provide this functionality out of the box\nAs noted in the \"cons\" list, this combines the namespace of stored keys (which may come from arbitrary and/or untrusted data!) with the namespace of builtin dict method attributes. For example:\nd = AttrDict()\nd.update({'items':[\"jacket\", \"necktie\", \"trousers\"]})\nfor k, v in d.items(): # TypeError: 'list' object is not callable\n print \"Never reached!\"\n\n"} {"input": "Django: \"projects\" vs \"apps\" I have a fairly complex \"product\" I'm getting ready to build using Django. I'm going to avoid using the terms \"project\" and \"application\" in this context, because I'm not clear on their specific meaning in Django.\nProjects can have many apps. Apps can be shared among many projects. Fine.\nI'm not reinventing the blog or forum - I don't see any portion of my product being reusable in any context. 
Intuitively, I would call this one \"application.\" Do I then do all my work in a single \"app\" folder?\nIf so... in terms of Django's project.app namespace, my inclination is to use myproduct.myproduct, but of course this isn't allowed (but the application I'm building is my project, and my project is an application!). I'm therefore led to believe that perhaps I'm supposed to approach Django by building one app per \"significant\" model, but I don't know where to draw the boundaries in my schema to separate it into apps - I have a lot of models with relatively complex relationships.\nI'm hoping there's a common solution to this...\n", "output": "\nTry to answer the question: \"What does my application do?\". If you cannot answer in a single sentence, then maybe you can split it into several apps with cleaner logic.\n\nI read this thought somewhere soon after I started to work with Django, and I find that I ask this question of myself quite often and it helps me. \nYour apps don't have to be reusable, they can depend on each other, but they should do one thing.\n"} {"input": "Python exit commands - why so many and when should each be used? It seems that Python supports many different commands to stop script execution. The choices I've found are: quit(), exit(), sys.exit(), os._exit() \nHave I missed any?\nWhat's the difference between them? When would you use each?\n", "output": "Let me give some information on them:\n\nquit raises the SystemExit exception behind the scenes. \nFurthermore, if you print it, it will give a message:\n>>> print (quit)\nUse quit() or Ctrl-Z plus Return to exit\n>>>\n\nThis functionality was included to help people who do not know Python. After all, one of the most likely things a newbie will try in order to exit Python is typing in quit.\nNevertheless, quit should not be used in production code. This is because it only works if the site module is loaded. 
Instead, this function should only be used in the interpreter.\nexit is an alias for quit (or vice-versa). They exist together simply to make Python more user-friendly. \nFurthermore, it too gives a message when printed:\n>>> print (exit)\nUse exit() or Ctrl-Z plus Return to exit\n>>>\n\nHowever, like quit, exit is considered bad to use in production code and should be reserved for use in the interpreter. This is because it too relies on the site module.\nsys.exit raises the SystemExit exception in the background. This means that it is the same as quit and exit in that respect. \nUnlike those two, however, sys.exit is considered good to use in production code. This is because the sys module will always be there.\nos._exit exits the program without calling cleanup handlers, flushing stdio buffers, etc. Thus, it is not a standard way to exit and should only be used in special cases. The most common of these is in the child process(es) created by os.fork.\nNote that, of the four methods given, only this one is unique in what it does.\n\nSummed up, all four methods exit the program. However, the first two are considered bad to use in production code and the last is a non-standard, dirty way that is only used in special scenarios. So, if you want to exit a program normally, go with the third method: sys.exit.\n\nOr, even better in my opinion, you can just do directly what sys.exit does behind the scenes and run:\nraise SystemExit\n\nThis way, you do not need to import sys first. \nHowever, this choice is simply one of style and is purely up to you.\n"} {"input": "Get the name of current script with Python I'm trying to get the name of the Python script that is currently running.\nFor example, I have a script called foo.py and I would like to do something like this inside it:\nprint Scriptname\n\nand get: foo.py.\n", "output": "Use __file__. 
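For instance (reusing the hypothetical foo.py name from the question), a script can simply print the path it was invoked with:

```python
# foo.py -- prints the name of the currently running script.
# __file__ holds the path used to load the script; it may be
# relative or absolute depending on how the script was started.
print(__file__)
```

Note that __file__ is defined when code runs from a file, but not at the interactive prompt.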
If you want to omit the directory part (which might be present), you can use os.path.basename(__file__) (after import os).\n"} {"input": "Sort a list of tuples by 2nd item (integer value) I have a list of tuples that looks something like this:\n[('abc', 121),('abc', 231),('abc', 148), ('abc',221)]\n\nI want to sort this list in ascending order by the integer value inside the tuples. Is it possible?\n", "output": "Try using the key keyword argument with sorted().\nsorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)], key=lambda x: x[1])\n\nkey should be a function that identifies how to retrieve the comparable element from your data structure. In your case, it is the second element of the tuple, so we access [1].\nFor optimization, see jamylak's response using itemgetter(1), which is essentially a faster version of lambda x: x[1].\n"} {"input": "Print current call stack from a method in Python code In Python, how can I print the current call stack from within a method (for debugging purposes)?\n", "output": "Here's an example of getting the stack via the traceback module, and printing it:\nimport traceback\n\ndef f():\n g()\n\ndef g():\n for line in traceback.format_stack():\n print(line.strip())\n\nf()\n\n# Prints:\n# File \"so-stack.py\", line 10, in <module>\n# f()\n# File \"so-stack.py\", line 4, in f\n# g()\n# File \"so-stack.py\", line 7, in g\n# for line in traceback.format_stack():\n\nIf you really only want to print the stack to stderr, you can use:\ntraceback.print_stack()\n\nOr to print to stdout (useful if you want to keep redirected output together), use:\ntraceback.print_stack(file=sys.stdout)\n\nBut getting it via traceback.format_stack() lets you do whatever you like with it.\n
I like it when it's right, because it gives me good code-completion and parameter info, and it gives me warnings if I try to access an attribute that doesn't exist.\nBut when it comes to parameters, it knows nothing. The code-completion dropdowns can't show anything, because they don't know what type the parameter will be. The code analysis can't look for warnings.\nclass Person:\n def __init__(self, name, age):\n self.name = name\n self.age = age\n\npeasant = Person(\"Dennis\", 37)\n# PyCharm knows that the \"peasant\" variable is of type Person\npeasant.dig_filth() # shows warning -- Person doesn't have a dig_filth method\n\nclass King:\n def repress(self, peasant):\n # PyCharm has no idea what type the \"peasant\" parameter should be\n peasant.knock_over() # no warning even though knock_over doesn't exist\n\nKing().repress(peasant)\n# Even if I call the method once with a Person instance, PyCharm doesn't\n# consider that to mean that the \"peasant\" parameter should always be a Person\n\nThis makes a certain amount of sense. Other call sites could pass anything for that parameter. But if my method expects a parameter to be of type, say, pygame.Surface, I'd like to be able to indicate that to PyCharm somehow, so it can show me all of Surface's attributes in its code-completion dropdown, and highlight warnings if I call the wrong method, and so on.\nIs there a way I can give PyCharm a hint, and say \"psst, this parameter is supposed to be of type X\"? (Or perhaps, in the spirit of dynamic languages, \"this parameter is supposed to quack like an X\"? I'd be fine with that.)\n\nEDIT: CrazyCoder's answer, below, does the trick. 
For any newcomers like me who want the quick summary, here it is:\nclass King:\n def repress(self, peasant):\n \"\"\"\n Exploit the workers by hanging on to outdated imperialist dogma which\n perpetuates the economic and social differences in our society.\n\n @type peasant: Person\n @param peasant: Person to repress.\n \"\"\"\n peasant.knock_over() # Shows a warning. And there was much rejoicing.\n\nThe relevant part is the @type peasant: Person line of the docstring.\nIf you also go to File > Settings > Python Integrated Tools and set \"Docstring format\" to \"Epytext\", then PyCharm's View > Quick Documentation Lookup will pretty-print the parameter information instead of just printing all the @-lines as-is.\n", "output": "Yes, you can use a special documentation format for methods and their parameters so that PyCharm can know the type. Recent PyCharm versions support most common doc formats.\nFor example, PyCharm extracts types from @param style comments.\nSee also reStructuredText and docstring conventions (PEP 257).\nAnother option is Python 3 annotations.\nPlease refer to the PyCharm documentation section for more details and samples.\n"} {"input": "How to download image using requests I'm trying to download and save an image from the web using Python's requests module.\nHere is the (working) code I used:\nimg = urllib2.urlopen(settings.STATICMAP_URL.format(**data))\nwith open(path, 'w') as f:\n f.write(img.read())\n\nHere is the new (non-working) code using requests:\nr = requests.get(settings.STATICMAP_URL.format(**data))\nif r.status_code == 200:\n img = r.raw.read()\n with open(path, 'w') as f:\n f.write(img)\n\nCan you help me on what attribute from the response to use from requests?\n", "output": "You can either use the response.raw file object, or iterate over the response.\nUsing the response.raw file-like object will not, by default, decode compressed responses (with GZIP or deflate). 
You can force it to decompress for you anyway by setting the decode_content attribute to True (requests sets it to False to control decoding itself). You can then use shutil.copyfileobj() to have Python stream the data to a file object:\nimport requests\nimport shutil\n\nr = requests.get(settings.STATICMAP_URL.format(**data), stream=True)\nif r.status_code == 200:\n with open(path, 'wb') as f:\n r.raw.decode_content = True\n shutil.copyfileobj(r.raw, f) \n\nTo iterate over the response use a loop; iterating like this ensures that data is decompressed by this stage:\nr = requests.get(settings.STATICMAP_URL.format(**data), stream=True)\nif r.status_code == 200:\n with open(path, 'wb') as f:\n for chunk in r:\n f.write(chunk)\n\nThis'll read the data in 128 byte chunks; if you feel another chunk size works better, use the Response.iter_content() method with a custom chunk size:\nr = requests.get(settings.STATICMAP_URL.format(**data), stream=True)\nif r.status_code == 200:\n with open(path, 'wb') as f:\n for chunk in r.iter_content(1024):\n f.write(chunk)\n\nNote that you need to open the destination file in binary mode to ensure python doesn't try and translate newlines for you. We also set stream=True so that requests doesn't download the whole image into memory first.\n"} {"input": "How to initialize a two-dimensional array in Python? I'm beginning python and I'm trying to use a two-dimensional list, that I initially fill up with the same variable in every place. I came up with this:\ndef initialize_twodlist(foo):\n twod_list = []\n new = []\n for i in range (0, 10):\n for j in range (0, 10):\n new.append(foo)\n twod_list.append(new)\n new = []\n\nIt gives the desired result, but feels like a workaround. 
Is there an easier/shorter/more elegant way to do this?\n", "output": "A pattern that often came up in Python was\nbar = []\nfor item in some_iterable:\n bar.append(SOME EXPRESSION)\n\nwhich helped motivate the introduction of list comprehensions, which convert that snippet to\nbar = [SOME EXPRESSION for item in some_iterable]\n\nwhich is shorter and sometimes clearer. Usually you get in the habit of recognizing these and often replacing loops with comprehensions.\nYour code follows this pattern twice\ntwod_list = [] \\ \nfor i in range (0, 10): \\\n new = [] \\ can be replaced } this too\n for j in range (0, 10): } with a list /\n new.append(foo) / comprehension /\n twod_list.append(new) /\n\n"} {"input": "Else clause on Python while statement I've noticed the following code is legal in Python. My question is why? Is there a specific reason?\nn = 5\nwhile n != 0:\n print n\n n -= 1\nelse:\n print \"what the...\"\n\n", "output": "The else clause is only executed when your while condition becomes false. If you break out of the loop, or if an exception is raised, it won't be executed. \nOne way to think about it is as an if/else construct with respect to the condition:\nif condition:\n handle_true()\nelse:\n handle_false()\n\nis analogous to the looping construct:\nwhile condition:\n handle_true()\nelse:\n # condition is false now, handle and go on with the rest of the program\n handle_false()\n\nAn example might be along the lines of:\nwhile value < threshold:\n if not process_acceptable_value(value):\n # something went wrong, exit the loop; don't pass go, don't collect 200\n break\n value = update(value)\nelse:\n # value >= threshold; pass go, collect 200\n handle_threshold_reached()\n\n"} {"input": "How do I get the parent directory in Python? Could someone tell me how to get the parent directory of a path in Python in a cross platform way. 
E.g.\nC:\\Program Files ---> C:\\\n\nand\nC:\\ ---> C:\\\n\nIf the directory doesn't have a parent directory, it returns the directory itself. The question might seem simple but I couldn't dig it up through Google.\n", "output": "Using os.path.dirname:\n>>> os.path.dirname(r'C:\\Program Files')\n'C:\\\\'\n>>> os.path.dirname('C:\\\\')\n'C:\\\\'\n>>>\n\n"} {"input": "What's the bad magic number error? What's the \"Bad magic number\" ImportError in python, and how do I fix it?\nThe only thing I can find online suggests this is caused by compiling a .py -> .pyc file and then trying to use it with the wrong version of python. In my case, however, the file seems to import fine some times but not others, and I'm not sure why.\nThe information python's providing in the traceback isn't particularly helpful (which is why I was asking here...), but here it is in case it helps:\nTraceback (most recent call last):\n File \"run.py\", line 7, in <module>\n from Normalization import Normalizer\n\n", "output": "The magic number comes from UNIX-type systems where the first few bytes of a file held a marker indicating the file type.\nPython puts a similar marker into its pyc files when it creates them.\nThen the Python interpreter makes sure this number is correct when loading it.\nAnything that damages this magic number will cause your problem. This includes editing the pyc file or trying to run a pyc from a different version of Python (usually later) than your interpreter.\nIf they are your pyc files, just delete them and let the interpreter re-compile the py files. On UNIX-type systems, that could be something as simple as:\nrm *.pyc\n\nor:\nfind . -name '*.pyc' -delete\n\nIf they are not yours, you'll have to either get the py files for re-compilation, or an interpreter that can run the pyc files with that particular magic value.\nOne thing might be causing the intermittent nature: the pyc that's causing the problem may only be imported under certain conditions. 
It's highly unlikely it would import sometimes. You should check the actual full stack trace when the import fails.\nAs an aside, the first word of all my 2.5.1(r251:54863) pyc files is 62131, 2.6.1(r261:67517) is 62161. The list of all magic numbers can be found in Python/import.c, reproduced here for completeness (current as at the time the answer was posted, it may have changed since then):\n1.5: 20121\n1.5.1: 20121\n1.5.2: 20121\n1.6: 50428\n2.0: 50823\n2.0.1: 50823\n2.1: 60202\n2.1.1: 60202\n2.1.2: 60202\n2.2: 60717\n2.3a0: 62011\n2.3a0: 62021\n2.3a0: 62011\n2.4a0: 62041\n2.4a3: 62051\n2.4b1: 62061\n2.5a0: 62071\n2.5a0: 62081\n2.5a0: 62091\n2.5a0: 62092\n2.5b3: 62101\n2.5b3: 62111\n2.5c1: 62121\n2.5c2: 62131\n2.6a0: 62151\n2.6a1: 62161\n2.7a0: 62171\n\n"} {"input": "How to check whether two lists are circularly identical in Python For instance, I have lists:\na[0] = [1, 1, 1, 0, 0]\na[1] = [1, 1, 0, 0, 1]\na[2] = [0, 1, 1, 1, 0]\n# and so on\n\nThey seem to be different, but if it is supposed that the start and the end are connected, then they are circularly identical.\nThe problem is, each list which I have has a length of 55 and contains only three ones and 52 zeros in it. Without the circular condition, there are 26,235 (55 choose 3) lists. However, if the condition 'circular' exists, there are a huge number of circularly identical lists.\nCurrently I check circular identity as follows:\ndef is_dup(a, b):\n for i in range(len(a)):\n if a == list(numpy.roll(b, i)): # shift b circularly by i\n return True\n return False\n\nThis function requires 55 cyclic shift operations in the worst case. And there are 26,235 lists to be compared with each other. In short, I need 55 * 26,235 * (26,235 - 1) / 2 = 18,926,847,225 computations. It's nearly 20 Giga!\nIs there any good way to do it with less computations? 
Or any data type that supports circularity?\n", "output": "First off, this can be done in O(n) in terms of the length of the list.\nNotice that if you duplicate your list ([1, 2, 3] becomes [1, 2, 3, 1, 2, 3]), the new list will definitely hold all possible cyclic rotations of the original.\nSo all you need to do is check whether the list you are searching for is inside a doubled copy of your starting list. In Python you can achieve this in the following way (assuming that the lengths are the same):\nlist1 = [1, 1, 1, 0, 0]\nlist2 = [1, 1, 0, 0, 1]\nprint ' '.join(map(str, list2)) in ' '.join(map(str, list1 * 2))\n\nSome explanation about my one-liner:\nlist * 2 will combine a list with itself, map(str, [1, 2]) converts all numbers to strings, and ' '.join() will convert the array ['1', '2', '111'] into the string '1 2 111'.\nAs pointed out by some people in the comments, the one-liner can potentially give some false positives, so to cover all the possible edge cases:\ndef isCircular(arr1, arr2):\n if len(arr1) != len(arr2):\n return False\n\n str1 = ' '.join(map(str, arr1))\n str2 = ' '.join(map(str, arr2))\n if len(str1) != len(str2):\n return False\n\n return str1 in str2 + ' ' + str2\n\nP.S.1 When speaking about time complexity, it is worth noticing that O(n) will be achieved only if the substring search runs in O(n) time. That is not always so and depends on the implementation in your language (although it can potentially be done in linear time, with KMP for example).\nP.S.2 For people who are afraid of string operations and for that reason think the answer is not good: what matters is complexity and speed. This algorithm potentially runs in O(n) time and O(n) space, which makes it much better than anything in the O(n^2) domain. To see this by yourself, you can run a small benchmark (it creates a random list, pops the first element, and appends it to the end, thus creating a cyclic list. 
You are free to do your own manipulations)\nfrom random import random\nbigList = [int(1000 * random()) for i in xrange(10**6)]\nbigList2 = bigList[:]\nbigList2.append(bigList2.pop(0))\n\n# then test how much time it will take to come up with an answer\nfrom datetime import datetime\nstartTime = datetime.now()\nprint isCircular(bigList, bigList2)\nprint datetime.now() - startTime # please feel free to use timeit, but it will give similar results\n\n0.3 seconds on my machine. Not really long. Now try to compare this with O(n^2) solutions. While it is running, you can travel from the US to Australia (most probably by a cruise ship)\n"} {"input": "Pelican 3.3 pelican-quickstart error \"ValueError: unknown locale: UTF-8\" When I was trying to use Pelican 3.3, I typed the command \"pelican-quickstart\", and some errors showed up.\nThese are the errors:\n(PelicanEnv)59-127-113-90:myblog Richo$ pelican-quickstart\nTraceback (most recent call last):\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/bin/pelican-quickstart\", line 9, in <module>\n load_entry_point('pelican==3.3', 'console_scripts', 'pelican-quickstart')()\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/pkg_resources.py\", line 378, in load_entry_point\n return get_distribution(dist).load_entry_point(group, name)\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/pkg_resources.py\", line 2566, in load_entry_point\n return ep.load()\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/pkg_resources.py\", line 2260, in load\n entry = __import__(self.module_name, globals(),globals(), ['__name__'])\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/pelican/__init__.py\", line 16, in <module>\n from pelican.generators import (ArticlesGenerator, PagesGenerator,\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/pelican/generators.py\", line 20, in <module>\n from pelican.readers import Readers\n File 
\"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/pelican/readers.py\", line 11, in <module>\n import docutils.core\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/docutils/core.py\", line 20, in <module>\n from docutils import frontend, io, utils, readers, writers\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/docutils/frontend.py\", line 41, in <module>\n import docutils.utils\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/docutils/utils/__init__.py\", line 20, in <module>\n import docutils.io\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/docutils/io.py\", line 18, in <module>\n from docutils.utils.error_reporting import locale_encoding, ErrorString, ErrorOutput\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/site-packages/docutils/utils/error_reporting.py\", line 47, in <module>\n locale_encoding = locale.getlocale()[1] or locale.getdefaultlocale()[1]\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/locale.py\", line 513, in getdefaultlocale\n return _parse_localename(localename)\n File \"/Users/Richo/Dropbox/Github/PelicanEnv/lib/python2.7/locale.py\", line 445, in _parse_localename\n raise ValueError, 'unknown locale: %s' % localename\nValueError: unknown locale: UTF-8\n\nMy OS is OS X Mavericks.\n", "output": "You could try a solution posted here or here. Basically, add some lines to your ~/.bash_profile:\nexport LC_ALL=en_US.UTF-8\nexport LANG=en_US.UTF-8\n\nThere is an outstanding bug report related to this issue. It appears that Python makes some assumptions about the format of locale names that aren't universally valid. Explicitly setting these environment vars is basically just a workaround for that bug.\n[Edit:] As @asmeurer correctly points out, the above fix assumes English and the U.S. 
You should really pick your preferred locale from the list given by locale -a (generally one that ends in UTF-8).\n"} {"input": "How to get week number in Python? How do I find out what week number June 16th is in this year (wk24) with Python? \n", "output": "datetime.date has an isocalendar() method, which returns a tuple containing the calendar week:\n>>> datetime.date(2010, 6, 16).isocalendar()[1]\n24\n\ndatetime.date.isocalendar() is an instance method returning a tuple containing the year, week number and weekday, in that order, for the given date instance.\n"} {"input": "Programmatically saving image to Django ImageField OK, I've tried just about everything and I cannot get this to work.\n\nI have a Django model with an ImageField on it\nI have code that downloads an image via HTTP (tested and works)\nThe image is saved directly into the 'upload_to' folder (the upload_to being the one that is set on the ImageField)\nAll I need to do is associate the already existing image file path with the ImageField\n\nI've written this code about 6 different ways.\nThe problem I'm running into is all of the code that I'm writing results in the following behavior:\n(1) Django will make a 2nd file, (2) rename the new file, adding an _ to the end of the file name, then (3) not transfer any of the data over, leaving it basically an empty re-named file. What's left in the 'upload_to' path is 2 files, one that is the actual image, and one that is the name of the image, but is empty, and of course the ImageField path is set to the empty file that Django tried to create.\nIn case that was unclear, I'll try to illustrate:\n## Image generation code runs.... \n/Upload\n generated_image.jpg 4kb\n\n## Attempt to set the ImageField path...\n/Upload\n generated_image.jpg 4kb\n generated_image_.jpg 0kb\n\nImageField.Path = /Upload/generated_image_.jpg\n\nHow can I do this without having Django try to re-store the file? 
What I'd really like is something to this effect...\nmodel.ImageField.path = generated_image_path\n\n...but of course that doesn't work.\nAnd yes I've gone through the other questions here like this one as well as the django doc on File\nUPDATE\nAfter further testing, it only does this behavior when running under Apache on Windows Server. While running under the 'runserver' on XP it does not execute this behavior. \nI am stumped.\nHere is the code which runs successfully on XP...\nf = open(thumb_path, 'r')\nmodel.thumbnail = File(f)\nmodel.save()\n\n", "output": "I have some code that fetches an image off the web and stores it in a model. The important bits are:\nfrom django.core.files import File # you need this somewhere\nimport urllib\n\n\n# The following actually resides in a method of my model\n\nresult = urllib.urlretrieve(image_url) # image_url is a URL to an image\n\n# self.photo is the ImageField\nself.photo.save(\n os.path.basename(self.url),\n File(open(result[0]))\n )\n\nself.save()\n\nThat's a bit confusing because it's pulled out of my model and a bit out of context, but the important parts are:\n\nThe image pulled from the web is not stored in the upload_to folder, it is instead stored as a tempfile by urllib.urlretrieve() and later discarded.\nThe ImageField.save() method takes a filename (the os.path.basename bit) and a django.core.files.File object.\n\nLet me know if you have questions or need clarification.\nEdit: for the sake of clarity, here is the model (minus any required import statements):\nclass CachedImage(models.Model):\n url = models.CharField(max_length=255, unique=True)\n photo = models.ImageField(upload_to=photo_path, blank=True)\n\n def cache(self):\n \"\"\"Store image locally if we have a URL\"\"\"\n\n if self.url and not self.photo:\n result = urllib.urlretrieve(self.url)\n self.photo.save(\n os.path.basename(self.url),\n File(open(result[0]))\n )\n self.save()\n\n"} {"input": "Can a line of Python code know its indentation 
nesting level? From something like this:\nprint(get_indentation_level())\n\n print(get_indentation_level())\n\n print(get_indentation_level())\n\nI would like to get something like this:\n1\n2\n3\n\nCan the code read itself in this way?\nAll I want is the output from the more nested parts of the code to be more nested. In the same way that this makes code easier to read, it would make the output easier to read. \nOf course I could implement this manually, using e.g. .format(), but what I had in mind was a custom print function which would print(i*' ' + string) where i is the indentation level. This would be a quick way to make readable output on my terminal. \nIs there a better way to do this which avoids painstaking manual formatting?\n", "output": "If you want indentation in terms of nesting level rather than spaces and tabs, things get tricky. For example, in the following code:\nif True:\n print(\nget_nesting_level())\n\nthe call to get_nesting_level is actually nested one level deep, despite the fact that there is no leading whitespace on the line of the get_nesting_level call. Meanwhile, in the following code:\nprint(1,\n 2,\n get_nesting_level())\n\nthe call to get_nesting_level is nested zero levels deep, despite the presence of leading whitespace on its line.\nIn the following code:\nif True:\n if True:\n print(get_nesting_level())\n\nif True:\n print(get_nesting_level())\n\nthe two calls to get_nesting_level are at different nesting levels, despite the fact that the leading whitespace is identical.\nIn the following code:\nif True: print(get_nesting_level())\n\nis that nested zero levels, or one? In terms of INDENT and DEDENT tokens in the formal grammar, it's zero levels deep, but you might not feel the same way.\n\nIf you want to do this, you're going to have to tokenize the whole file up to the point of the call and count INDENT and DEDENT tokens. 
The tokenize module would be very useful for such a function:\nimport inspect\nimport tokenize\n\ndef get_nesting_level():\n caller_frame = inspect.currentframe().f_back\n filename, caller_lineno, _, _, _ = inspect.getframeinfo(caller_frame)\n with open(filename) as f:\n indentation_level = 0\n for token_record in tokenize.generate_tokens(f.readline):\n token_type, _, (token_lineno, _), _, _ = token_record\n if token_lineno > caller_lineno:\n break\n elif token_type == tokenize.INDENT:\n indentation_level += 1\n elif token_type == tokenize.DEDENT:\n indentation_level -= 1\n return indentation_level\n\n"} {"input": "How do you log server errors on django sites So, when playing with the development I can just set settings.DEBUG to True and if an error occures I can see it nicely formatted, with good stack trace and request information.\nBut on kind of production site I'd rather use DEBUG=False and show visitors some standard error 500 page with information that I'm working on fixing this bug at this moment ;)\nAt the same time I'd like to have some way of logging all those information (stack trace and request info) to a file on my server - so I can just output it to my console and watch errors scroll, email the log to me every hour or something like this.\nWhat logging solutions would you recomend for a django-site, that would meet those simple requirements? I have the application running as fcgi server and I'm using apache web server as frontend (although thinking of going to lighttpd).\n", "output": "Well, when DEBUG = False, Django will automatically mail a full traceback of any error to each person listed in the ADMINS setting, which gets you notifications pretty much for free. 
If you'd like more fine-grained control, you can write and add to your settings a middleware class which defines a method named process_exception(), which will have access to the exception that was raised:\nhttp://docs.djangoproject.com/en/dev/topics/http/middleware/#process-exception\nYour process_exception() method can then perform whatever type of logging you'd like: writing to console, writing to a file, etc., etc.\nEdit: though it's a bit less useful, you can also listen for the got_request_exception signal, which will be sent whenever an exception is encountered during request processing:\nhttp://docs.djangoproject.com/en/dev/ref/signals/#got-request-exception\nThis does not give you access to the exception object, however, so the middleware method is much easier to work with.\n"} {"input": "Python: Check if a string represents an int, Without using Try/Except? Is there any way to tell whether a string represents an integer (e.g., '3', '-17' but not '3.14' or 'asfasfas') without using a try/except mechanism?\nis_int('3.14') = False\nis_int('-7') = True\n\n", "output": "With positive integers you could use .isdigit():\n>>> '16'.isdigit()\nTrue\n\nIt doesn't work with negative integers, though. I suppose you could try the following:\n>>> s = '-17'\n>>> s.startswith('-') and s[1:].isdigit()\nTrue\n\nIt won't work with the '16.0' format, which is similar to int casting in this sense.\nEdit:\ndef check_int(s):\n if s[0] in ('-', '+'):\n return s[1:].isdigit()\n return s.isdigit()\n\n"} {"input": "Using javadoc for Python documentation I am currently beginning with Python and I have a strong PHP background, and in PHP I got into the habit of using javadoc as a documentation template.\nI was wondering if javadoc has its place as docstring documentation in Python. 
Is something like this too elaborate to fit in the Python mindset or should I try to be as concise as possible?\n\"\"\"\nreplaces template place holder with values\n\n@param string timestamp formatted date to display\n@param string priority priority number\n@param string priority_name priority name\n@param string message message to display\n\n@return string formatted string\n\"\"\"\n\nAnd if I am a bit too exhaustive should I go with something like this instead (where most of the documentation doesn't get printed through the __doc__ method)?\n# replaces template place holder with values\n# \n# @param string timestamp formatted date to display\n# @param string priority priority number\n# @param string priority_name priority name\n# @param string message message to display\n# \n# @return string formatted string\n\ndef format(self, timestamp = '', priority = '', priority_name = '', message = ''):\n \"\"\"\n replaces template place holder with values\n \"\"\"\n values = {'%timestamp%' : timestamp,\n '%priorityName%' : priority_name,\n '%priority%' : priority,\n '%message%' : message}\n\n return self.__pattern.format(**values)\n\n", "output": "Have a look at the reStructuredText (also known as \"reST\") format, which is a plaintext/docstring markup format, and probably the most popular in the Python world. And you should certainly look at Sphinx, a tool to generate documentation from reStructuredText (used for eg. the Python documentation itself). Sphinx includes the possibility to extract documentation from the docstrings in your code (see sphinx.ext.autodoc), and recognizes reST field lists following certain conventions. 
This has probably become (or is becoming) the most popular way to do it.\nYour example could look as follows:\n\"\"\"Replaces template placeholder with values.\n\n:param timestamp: formatted date to display\n:param priority: priority number\n:param priority_name: priority name\n:param message: message to display\n:returns: formatted string\n\"\"\"\n\nOr extended with type information:\n\"\"\"Replaces template placeholder with values.\n\n:param timestamp: formatted date to display\n:type timestamp: str or unicode\n:param priority: priority number\n:type priority: str or unicode\n:param priority_name: priority name\n:type priority_name: str or unicode\n:param message: message to display\n:type message: str or unicode\n:returns: formatted string\n:rtype: str or unicode\n\"\"\"\n\n"} {"input": "'import module' or 'from module import' I've tried to find a comprehensive guide on whether it is best to use import module or from module import. I've just started with Python, with the intention for developing web applications with Django and I'm trying to start off with best practices in mind.\nBasically, I was hoping if anyone could share their experiences, what preferences other developers have and whats the best way to avoid any gotchas down the road.\n", "output": "The difference between import module and from module import foo is mainly subjective. Pick the one you like best and be consistent in your use of it. Here are some points to help you decide.\nimport module\n\nPros:\n\nLess maintenance of your import statements. 
Don't need to add any additional imports to start using another item from the module\n\nCons:\n\nTyping module.foo in your code can be tedious and redundant (tedium can be minimized by using import module as mo then typing mo.foo)\n\n\nfrom module import foo\n\nPros:\n\nLess typing to use foo\nMore control over which items of a module can be accessed\n\nCons:\n\nTo use a new item from the module you have to update your import statement\nYou lose context about foo. For example, it's less clear what ceil() does compared to math.ceil()\n\n\nEither method is acceptable, but don't use from module import *. \nFor any reasonably large body of code, if you import * you will likely be cementing it into the module, unable to remove it. This is because it is difficult to determine what items used in the code are coming from 'module', making it easy to get to the point where you think you don't use the import any more but it's extremely difficult to be sure.\n"} {"input": "Is it possible to use pip to install a package from a private github repository? As the title suggests I am trying to install a python package from a private github repo. 
For a public repository I can issue the following command which works fine:\npip install git+git://github.com/django/django.git\n\nHowever if I try this for a private repository:\npip install git+git://github.com/echweb/echweb-utils.git\n\nI get the following output:\nDownloading/unpacking git+git://github.com/echweb/echweb-utils.git\nCloning Git repository git://github.com/echweb/echweb-utils.git to /var/folders/cB/cB85g9P7HM4jcPn7nrvWRU+++TI/-Tmp-/pip-VRsIoo-build\nComplete output from command /usr/local/bin/git clone git://github.com/echweb/echweb-utils.git /var/folders/cB/cB85g9P7HM4jcPn7nrvWRU+++TI/-Tmp-/pip-VRsIoo-build:\nfatal: The remote end hung up unexpectedly\n\nCloning into /var/folders/cB/cB85g9P7HM4jcPn7nrvWRU+++TI/-Tmp-/pip-VRsIoo-build...\n\n----------------------------------------\nCommand /usr/local/bin/git clone git://github.com/echweb/echweb-utils.git /var/folders/cB/cB85g9P7HM4jcPn7nrvWRU+++TI/-Tmp-/pip-VRsIoo-build failed with error code 128\n\nI guess this is because I am trying to access a private repository without providing any authentication. 
I therefore tried to use git+ssh hoping that pip would use my ssh public key to authenticate:\npip install git+ssh://github.com/echweb/echweb-utils.git\n\nThis gives the following output:\nDownloading/unpacking git+ssh://github.com/echweb/echweb-utils.git\nCloning Git repository ssh://github.com/echweb/echweb-utils.git to /var/folders/cB/cB85g9P7HM4jcPn7nrvWRU+++TI/-Tmp-/pip-DQB8s4-build\nComplete output from command /usr/local/bin/git clone ssh://github.com/echweb/echweb-utils.git /var/folders/cB/cB85g9P7HM4jcPn7nrvWRU+++TI/-Tmp-/pip-DQB8s4-build:\nCloning into /var/folders/cB/cB85g9P7HM4jcPn7nrvWRU+++TI/-Tmp-/pip-DQB8s4-build...\n\nPermission denied (publickey).\n\nfatal: The remote end hung up unexpectedly\n\n----------------------------------------\nCommand /usr/local/bin/git clone ssh://github.com/echweb/echweb-utils.git /var/folders/cB/cB85g9P7HM4jcPn7nrvWRU+++TI/-Tmp-/pip-DQB8s4-build failed with error code 128\n\nDoes anyone know if what I am trying to achieve is even possible? If so, can you please tell me how?\n", "output": "You can use the git+ssh URI scheme, but you must set the username:\npip install git+ssh://git@github.com/echweb/echweb-utils.git\n\nNote the git@ part in the URI.\nPS: Also read about deploy keys.\nPPS: In my installation the \"git+ssh\" URI scheme works only with \"editable\" requirements:\npip install -e URI#egg=EggName\n\nRemember: Change the : character that git remote -v prints to a / character before using the remote's address in the pip command:\n$ git remote -v\norigin git@github.com:echweb/echweb-utils.git (fetch)\n ^ change this to a '/' character\n\nIf you forget, you will get this error:\nssh: Could not resolve hostname github.com:echweb:\n nodename nor servname provided, or not known\n\n"} {"input": "How to create a zip archive of a directory How can I create a zip archive of a directory structure in Python? \n", "output": "As others have pointed out, you should use zipfile. 
The documentation tells you what functions are available, but doesn't really explain how you can use them to zip an entire directory. I think it's easiest to explain with some example code:\n#!/usr/bin/env python\nimport os\nimport zipfile\n\ndef zipdir(path, ziph):\n # ziph is zipfile handle\n for root, dirs, files in os.walk(path):\n for file in files:\n ziph.write(os.path.join(root, file))\n\nif __name__ == '__main__':\n zipf = zipfile.ZipFile('Python.zip', 'w', zipfile.ZIP_DEFLATED)\n zipdir('tmp/', zipf)\n zipf.close()\n\nAdapted from: http://www.devshed.com/c/a/Python/Python-UnZipped/\n"} {"input": "Hiding axis text in matplotlib plots I'm trying to plot a figure without tickmarks or numbers on either of the axes (I use axes in the traditional sense, not the matplotlib nomenclature!). An issue I have come across is where matplotlib adjusts the x(y)ticklabels by subtracting a value N, then adds N at the end of the axis.\nThis may be vague, but the following simplified example highlights the issue, with '6.18' being the offending value of N:\nimport matplotlib.pyplot as plt\nimport random\nprefix = 6.18\n\nrx = [prefix+(0.001*random.random()) for i in arange(100)]\nry = [prefix+(0.001*random.random()) for i in arange(100)]\nplt.plot(rx,ry,'ko')\n\nframe1 = plt.gca()\nfor xlabel_i in frame1.axes.get_xticklabels():\n xlabel_i.set_visible(False)\n xlabel_i.set_fontsize(0.0)\nfor xlabel_i in frame1.axes.get_yticklabels():\n xlabel_i.set_fontsize(0.0)\n xlabel_i.set_visible(False)\nfor tick in frame1.axes.get_xticklines():\n tick.set_visible(False)\nfor tick in frame1.axes.get_yticklines():\n tick.set_visible(False)\n\nplt.show()\n\nThe three things I would like to know are:\n\nHow to turn off this behaviour in the first place (although in most cases it is useful, it is not always!) I have looked through matplotlib.axis.XAxis and cannot find anything appropriate\nHow can I make N disappear (i.e. X.set_visible(False))\nIs there a better way to do the above anyway? 
My final plot would be 4x4 subplots in a figure, if that is relevant.\n\n", "output": "Instead of hiding each element, you can hide the whole axis:\nframe1.axes.get_xaxis().set_visible(False)\nframe1.axes.get_yaxis().set_visible(False)\n\nOr, you can set the ticks to an empty list:\nframe1.axes.get_xaxis().set_ticks([])\nframe1.axes.get_yaxis().set_ticks([])\n\nIn this second option, you can still use plt.xlabel() and plt.ylabel() to add labels to the axes.\n"} {"input": "How do I split a multi-line string into multiple lines? I have a multi-line string literal that I want to do an operation on each line, like so:\ninputString = \"\"\"Line 1\nLine 2\nLine 3\"\"\"\n\nI want to do something like the following:\nfor line in inputString:\n doStuff()\n\n", "output": "inputString.splitlines()\n\nThis will give you a list of the lines; the splitlines() method is designed to split a multi-line string into one list element per line.\n"} {"input": "How do I get a Cron like scheduler in Python? I'm looking for a library in Python which will provide at and cron like functionality.\nI'd quite like to have a pure Python solution, rather than relying on tools installed on the box; this way I run on machines with no cron.\nFor those unfamiliar with cron: you can schedule tasks based upon an expression like: \n 0 2 * * 7 /usr/bin/run-backup # run the backups at 0200 on Every Sunday\n 0 9-17/2 * * 1-5 /usr/bin/purge-temps # run the purge temps command, every 2 hours between 9am and 5pm on Mondays to Fridays.\n\nThe cron time expression syntax is less important, but I would like to have something with this sort of flexibility. \nIf there isn't something that does this for me out of the box, any suggestions for the building blocks to make something like this would be gratefully received.\nEdit\nI'm not interested in launching processes, just \"jobs\" also written in Python - Python functions. 
By necessity I think this would be a different thread, but not in a different process.\nTo this end, I'm looking for the expressivity of the cron time expression, but in Python. \nCron has been around for years, but I'm trying to be as portable as possible. I cannot rely on its presence.\n", "output": "If you're looking for something lightweight, check out schedule:\nimport schedule\nimport time\n\ndef job():\n print(\"I'm working...\")\n\nschedule.every(10).minutes.do(job)\nschedule.every().hour.do(job)\nschedule.every().day.at(\"10:30\").do(job)\n\nwhile 1:\n schedule.run_pending()\n time.sleep(1)\n\nDisclosure: I'm the author of that library.\n"} {"input": "Django set default form values I have a Model as follows\nclass TankJournal(models.Model):\n user = models.ForeignKey(User)\n tank = models.ForeignKey(TankProfile)\n ts = models.IntegerField(max_length=15)\n title = models.CharField(max_length=50)\n body = models.TextField()\n\nI have a modelform as follows for that model\nclass JournalForm(ModelForm):\n tank = forms.IntegerField(widget=forms.HiddenInput()) \n\n class Meta:\n model = TankJournal\n exclude = ('user','ts')\n\nI want to know how to set the default value for that tank hidden field. 
Here is my function to show/save the form so far\ndef addJournal(request, id=0):\n if not request.user.is_authenticated():\n return HttpResponseRedirect('/')\n\n #\n # checking if they own the tank\n #\n from django.contrib.auth.models import User\n user = User.objects.get(pk=request.session['id'])\n\n if request.method == 'POST':\n form = JournalForm(request.POST)\n if form.is_valid():\n obj = form.save(commit=False)\n\n #\n # setting the user and ts\n #\n from time import time\n obj.ts = int(time())\n obj.user = user\n\n obj.tank = TankProfile.objects.get(pk=form.cleaned_data['tank_id'])\n\n #\n # saving the test\n #\n obj.save()\n\n else:\n form = JournalForm()\n\n try:\n tank = TankProfile.objects.get(user=user, id=id)\n except TankProfile.DoesNotExist:\n return HttpResponseRedirect('/error/')\n\n form.tank = id\n return render_to_response('ajax/tank_addJournal.html', {'form': form}, context_instance=RequestContext(request))\n\nthanks!\n", "output": "You can use initial which is explained here\nYou have two options either populate the value when calling form constructor:\nform = JournalForm(initial={'tank': 123})\n\nor set the value in the form definition:\ntank = forms.IntegerField(widget=forms.HiddenInput(), initial=123)\n\n"} {"input": "Flask - configure dev server to be visible across the network I'm not sure if this is Flask specific, but when I run an app in dev mode (http://localhost:5000), I cannot access it from other machines on the network (with http://[dev-host-ip]:5000). With Rails in dev mode, for example, it works fine. I couldn't find any docs regarding the Flask dev server configuration. Any idea what should be configured to enable this?\n", "output": "Add a parameter to your app.run(). 
By default it runs on localhost; change it to app.run(host='0.0.0.0') to run on your machine's IP address.\nDocumented on the Flask site under \"Externally Visible Server\" on the Quickstart page:\n\nExternally Visible Server\nIf you run the server you will notice that the server is only\n available from your own computer, not from any other in the network.\n This is the default because in debugging mode a user of the\n application can execute arbitrary Python code on your computer. If you\n have debug disabled or trust the users on your network, you can make\n the server publicly available.\nJust change the call of the run() method to look like this:\napp.run(host='0.0.0.0')\nThis tells your operating system to listen on a public IP.\n\n"} {"input": "How can I get the whole request POST body in Python with Flask? I want to be able to get the whole POST body of the request as a string or file handle that I can read, but I'm unable to find a way (I'm new to Flask and to Python for the web). This is what I got:\nfrom flask import Flask\n\napp = Flask(__name__)\n@app.route('/', methods=['GET', 'POST'])\ndef parse_request():\n # obtain the whole post body here, in case the request is post...\n\nAS THIS QUESTION IS GETTING MANY VIEWS:\nThe answer to the question linked below describes a way that works only if the content type is not recognised; this question instead asks how to get the whole raw POST body regardless of the headers:\nGet raw POST body in Python Flask regardless of Content-Type header\n", "output": "The docs describe the attributes available on the request. 
In most common cases request.data will be empty because it's used as a fallback:\n\nrequest.data Contains the incoming request data as string in case it came with a mimetype Flask does not handle.\n\n\nrequest.args: the key/value pairs in the URL query string\nrequest.form: the key/value pairs in the body, as sent by a HTML POST form\nrequest.files: the files in the body, which Flask keeps separate from form\nrequest.values: combined args and form, preferring args if keys overlap\n\n"} {"input": "Postgres: INSERT if does not exist already I'm using Python to write to a postgres database:\nsql_string = \"INSERT INTO hundred (name,name_slug,status) VALUES (\"\nsql_string += hundred + \", '\" + hundred_slug + \"', \" + status + \");\"\ncursor.execute(sql_string)\n\nBut because some of my rows are identical, I get the following error:\npsycopg2.IntegrityError: duplicate key value \n violates unique constraint \"hundred_pkey\"\n\nHow can I write an 'INSERT unless this row already exists' SQL statement? \nI've seen complex statements like this recommended:\nIF EXISTS (SELECT * FROM invoices WHERE invoiceid = '12345')\nUPDATE invoices SET billed = 'TRUE' WHERE invoiceid = '12345'\nELSE\nINSERT INTO invoices (invoiceid, billed) VALUES ('12345', 'TRUE')\nEND IF\n\nBut firstly, is this overkill for what I need, and secondly, how can I execute one of those as a simple string?\n", "output": "\nHow can I write an 'INSERT unless this row already exists' SQL statement? \n\nThere is a nice way of doing conditional INSERT in PostgreSQL:\nINSERT INTO example_table\n (id, name)\nSELECT 1, 'John'\nWHERE\n NOT EXISTS (\n SELECT id FROM example_table WHERE id = 1\n );\n\nCAVEAT This approach is not 100% reliable for concurrent write operations, though. There is a very tiny race condition between the SELECT in the NOT EXISTS anti-semi-join and the INSERT itself. 
It can fail under such conditions.\n"} {"input": "Django auto_now and auto_now_add For Django 1.1.\nI have this in my models.py:\nclass User(models.Model):\n created = models.DateTimeField(auto_now_add=True)\n modified = models.DateTimeField(auto_now=True)\n\nWhen updating a row I get:\n[Sun Nov 15 02:18:12 2009] [error] /home/ptarjan/projects/twitter-meme/django/db/backends/mysql/base.py:84: Warning: Column 'created' cannot be null\n[Sun Nov 15 02:18:12 2009] [error] return self.cursor.execute(query, args)\n\nThe relevant part of my database is:\n `created` datetime NOT NULL,\n `modified` datetime NOT NULL,\n\nIs this cause for concern?\nSide question: in my admin tool, those two fields aren't showing up. Is that expected?\n", "output": "Any field with the auto_now attribute set will also inherit editable=False and therefore will not show up in the admin panel. There has been talk in the past about making the auto_now and auto_now_add arguments go away, and although they still exist, I feel you're better off just using a custom save() method.\nSo, to make this work properly, I would recommend not using auto_now or auto_now_add and instead define your own save() method to make sure that created is only updated if id is not set (such as when the item is first created), and have it update modified every time the item is saved.\nI have done the exact same thing with other projects I have written using Django, and so your save() would look like this:\nfrom django.utils import timezone\n\nclass User(models.Model):\n created = models.DateTimeField(editable=False)\n modified = models.DateTimeField()\n\n def save(self, *args, **kwargs):\n ''' On save, update timestamps '''\n if not self.id:\n self.created = timezone.now()\n self.modified = timezone.now()\n return super(User, self).save(*args, **kwargs)\n\nHope this helps!\nEdit in response to comments:\nThe reason why I just stick with overloading save() vs. 
relying on these field arguments is two-fold:\n\nThe aforementioned ups and downs with their reliability. These arguments are heavily reliant on the way each type of database that Django knows how to interact with treats a date/time stamp field, and seems to break and/or change between every release. (Which I believe is the impetus behind the call to have them removed altogether).\nThe fact that they only work on DateField, DateTimeField, and TimeField, and by using this technique you are able to automatically populate any field type every time an item is saved.\nUse django.utils.timezone.now() vs. datetime.datetime.now(), because it will return a TZ-aware or naive datetime.datetime object depending on settings.USE_TZ.\n\nTo address why the OP saw the error, I don't know exactly, but it looks like created isn't even being populated at all, despite having auto_now_add=True. To me it stands out as a bug, and underscores item #1 in my little list above: auto_now and auto_now_add are flaky at best. \n"} {"input": "Calculating arithmetic mean (average) in Python Is there a built-in or standard library method in Python to calculate the arithmetic mean (average) of a list of numbers?\n", "output": "I am not aware of anything in the standard library. However, you could use something like:\ndef mean(numbers):\n return float(sum(numbers)) / max(len(numbers), 1)\n\n>>> mean([1,2,3,4])\n2.5\n>>> mean([])\n0.0\n\nIn numpy, there's numpy.mean().\n"} {"input": "Python variable scope error I've been programming for many years, and recently started learning Python. The following code works as expected in both python 2.5 and 3.0 (on OS X if that matters):\na, b, c = (1, 2, 3)\n\nprint(a, b, c)\n\ndef test():\n print(a)\n print(b)\n print(c) # (A)\n #c+=1 # (B)\ntest()\n\nHowever, when I uncomment line (B), I get an UnboundLocalError: 'c' not assigned at line (A). The values of a and b are printed correctly. 
This has me completely baffled for two reasons:\n\nWhy is there a runtime error thrown at line (A) because of a later statement on line (B)?\nWhy are variables a and b printed as expected, while c raises an error?\n\nThe only explanation I can come up with is that a local variable c is created by the assignment c+=1, which takes precedent over the \"global\" variable c even before the local variable is created. Of course, it doesn't make sense for a variable to \"steal\" scope before it exists.\nCould someone please explain this behavior?\n", "output": "Python treats variables in functions differently depending on whether you assign values to them from within the function or not. If you assign any value to a variable, it is treated by default as a local variable. Therefore, when you uncomment the line, you are attempting to reference a local variable before any value has been assigned to it.\nIf you want the variable c to refer to the global c put\nglobal c\n\nas the first line of the function.\nAs of python 3, there is now\nnonlocal c\n\nthat you can use to refer to the nearest enclosing (not necessarily global) scope.\n"} {"input": "What is the best way to call a Python script from another Python script? I have a script named test1.py which is not in a module. It just has code that should execute when the script itself is run. There are no functions, classes, methods, etc. I have another script which runs as a service. I want to call test1.py from the script running as a service.\nFor example:\nFile test1.py\nprint \"I am a test\"\nprint \"see! I do nothing productive.\"\n\nFile service.py\n# Lots of stuff here\ntest1.py # do whatever is in test1.py\n\nI'm aware of one method which is opening the file, reading the contents, and basically eval'ing it. I'm assuming there's a better way of doing this. 
Or at least I hope so.\n", "output": "The usual way to do this is something like the following.\ntest1.py\ndef some_func():\n print 'in test 1, unproductive'\n\nif __name__ == '__main__':\n # test1.py executed as script\n # do something\n some_func()\n\nservice.py\nimport test1\n\ndef service_func():\n print 'service func'\n\nif __name__ == '__main__':\n # service.py executed as script\n # do something\n service_func()\n test1.some_func()\n\n"} {"input": "Easiest way to rename a model using Django/South? I've been hunting for an answer to this on South's site, Google, and SO, but couldn't find a simple way to do this. \nI want to rename a Django model using South.\nSay you have the following:\nclass Foo(models.Model):\n name = models.CharField()\n\nclass FooTwo(models.Model):\n name = models.CharField()\n foo = models.ForeignKey(Foo)\n\nand you want to convert Foo to Bar, namely\nclass Bar(models.Model):\n name = models.CharField()\n\nclass FooTwo(models.Model):\n name = models.CharField()\n foo = models.ForeignKey(Bar)\n\nTo keep it simple, I'm just trying to change the name from Foo to Bar, but ignore the foo member in FooTwo for now.\nWhat's the easiest way to do this using South?\n\nI could probably do a data migration, but that seems pretty involved.\nWrite a custom migration, e.g. db.rename_table('city_citystate', 'geo_citystate'), but I'm not sure how to fix the foreign key in this case.\nAn easier way that you know?\n\n", "output": "To answer your first question, the simple model/table rename is pretty straightforward. Run the command:\n./manage.py schemamigration yourapp rename_foo_to_bar --empty\n\n(Update 2: try --auto instead of --empty to avoid the warning below. 
Thanks to @KFB for the tip.)\nIf you're using an older version of south, you'll need startmigration instead of schemamigration.\nThen manually edit the migration file to look like this:\nclass Migration(SchemaMigration):\n\n def forwards(self, orm):\n db.rename_table('yourapp_foo', 'yourapp_bar')\n\n\n def backwards(self, orm):\n db.rename_table('yourapp_bar','yourapp_foo') \n\nYou can accomplish this more simply using the db_table Meta option in your model class. But every time you do that, you increase the legacy weight of your codebase -- having class names differ from table names makes your code harder to understand and maintain. I fully support doing simple refactorings like this for the sake of clarity.\n(update) I just tried this in production, and got a strange warning when I went to apply the migration. It said:\n\nThe following content types are stale and need to be deleted:\n\n yourapp | foo\n\nAny objects related to these content types by a foreign key will also\nbe deleted. Are you sure you want to delete these content types?\nIf you're unsure, answer 'no'.\n\n\nI answered \"no\" and everything seemed to be fine.\n"} {"input": "In Python, how do I iterate over a dictionary in sorted order? There's an existing function that ends in:\nreturn dict.iteritems()\n\nthat returns an unsorted iterator for a given dictionary. I would like to return an iterator that goes through the items in sorted order. How do I do that?\n", "output": "Haven't tested this very extensively, but works in Python 2.5.2.\n>>> d = {\"x\":2, \"h\":15, \"a\":2222}\n>>> it = iter(sorted(d.iteritems()))\n>>> it.next()\n('a', 2222)\n>>> it.next()\n('h', 15)\n>>> it.next()\n('x', 2)\n>>>\n\n"} {"input": "Is there a way to use PhantomJS in Python? I want to use PhantomJS in Python. I googled this problem but couldn't find proper solutions.\nI find os.popen() may be a good choice. But I couldn't pass some arguments to it.\nUsing subprocess.Popen() may be a proper solution for now. 
I want to know whether there's a better solution or not.\nIs there a way to use PhantomJS in Python?\n", "output": "The easiest way to use PhantomJS in Python is via Selenium. The simplest installation method is:\n\nInstall NodeJS.\nUsing Node's package manager, install PhantomJS: npm -g install phantomjs\nInstall Selenium (in your virtualenv, if you are using one).\n\nAfter installation, you may use PhantomJS as simply as:\nfrom selenium import webdriver\n\ndriver = webdriver.PhantomJS() # or add to your PATH\ndriver.set_window_size(1024, 768) # optional\ndriver.get('https://google.com/')\ndriver.save_screenshot('screen.png') # save a screenshot to disk\nsbtn = driver.find_element_by_css_selector('button.gbqfba')\nsbtn.click()\n\nIf your system path environment variable isn't set correctly, you'll need to specify the exact path as an argument to webdriver.PhantomJS(). Replace this:\ndriver = webdriver.PhantomJS() # or add to your PATH\n\n... with the following:\ndriver = webdriver.PhantomJS(executable_path='/usr/local/lib/node_modules/phantomjs/lib/phantom/bin/phantomjs')\n\nReferences:\n\nhttp://selenium-python.readthedocs.org/en/latest/api.html\nHow do I set a proxy for phantomjs/ghostdriver in python webdriver?\nhttp://python.dzone.com/articles/python-testing-phantomjs\n\n"} {"input": "Convert a number range to another range, maintaining ratio I'm trying to convert one range of numbers to another, maintaining ratio. Maths is not my strong point.\nI have an image file where point values may range from -16000.00 to 16000.00, though the typical range may be much less. What I want to do is compress these values into the integer range 0-100, where 0 is the value of the smallest point, and 100 is the value of the largest. All points in between should keep a relative ratio even though some precision is being lost. I'd like to do this in Python, but even a general algorithm should suffice. 
I'd prefer an algorithm where the min/max of either range can be adjusted (i.e., the second range could be -50 to 800 instead of 0 to 100).\n", "output": "NewValue = (((OldValue - OldMin) * (NewMax - NewMin)) / (OldMax - OldMin)) + NewMin\n\nOr a little more readable: \nOldRange = (OldMax - OldMin) \nNewRange = (NewMax - NewMin) \nNewValue = (((OldValue - OldMin) * NewRange) / OldRange) + NewMin\n\nOr if you want to protect against the case where the old range is 0 (OldMin = OldMax):\nOldRange = (OldMax - OldMin)\nif (OldRange == 0)\n NewValue = NewMin\nelse\n{\n NewRange = (NewMax - NewMin) \n NewValue = (((OldValue - OldMin) * NewRange) / OldRange) + NewMin\n}\n\nNote that in this case we're forced to pick one of the possible new range values arbitrarily. Depending on context, sensible choices could be: NewMin (see sample), NewMax or (NewMin + NewMax) / 2\n"} {"input": "How to start a background process in Python? I'm trying to port a shell script to the much more readable Python version. The original shell script starts several processes (utilities, monitors, etc.) in the background with \"&\". How can I achieve the same effect in Python? I'd like these processes not to die when the Python script completes. I am sure it's related to the concept of a daemon somehow, but I couldn't find how to do this easily.\n", "output": "While jkp's solution works, the newer way of doing things (and the way the documentation recommends) is to use the subprocess module. For simple commands it's equivalent, but it offers more options if you want to do something complicated.\nExample for your case:\nimport subprocess\nsubprocess.Popen([\"rm\",\"-r\",\"some.file\"])\n\nThis should run rm -r some.file in the background. 
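As a quick illustration (a minimal sketch assuming a POSIX system; the sleep command here just stands in for any long-running utility), Popen returns immediately while the child keeps running:

```python
import subprocess

# Launch a long-running command; Popen does not wait for it to finish.
proc = subprocess.Popen(["sleep", "5"])

# poll() returns None while the child is still running,
# so the script can carry on doing other work here.
print(proc.poll())

# Clean up for this demo; a real background utility would be left running.
proc.terminate()
proc.wait()
```

For processes that must outlive the script, you would simply leave out the terminate()/wait() cleanup.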
But be wary: subprocess.Popen() only runs a process in the background if nothing in the Python script depends on the output of the command being run:\nFor example, the following command will not run in the background:\nimport subprocess\nls_output=subprocess.Popen([\"ls\", \"-a\"], stdout=subprocess.PIPE)\n\nSee the documentation here.\nAlso, a point of clarification: \"background\" is purely a shell concept; what you probably want is to spawn a new process. I've used \"background\" here to refer to shell-background-like behavior, but don't mistake this for the process actually being in the background.\n"} {"input": "Why doesn't os.path.join() work in this case? The code below will not join; when debugged, the command does not store the whole path, just the last entry.\nos.path.join('/home/build/test/sandboxes/', todaystr, '/new_sandbox/')\n\nWhen I test this, it only stores the /new_sandbox/ part.\n", "output": "The latter strings shouldn't start with a slash. If they start with a slash, then they're considered an \"absolute path\" and everything before them is discarded.\nQuoting the Python docs for os.path.join:\n\nIf any component is an absolute path, all previous components (on Windows, including the previous drive letter, if there was one) are thrown away, and joining continues. \n\n"} {"input": "Python setup.py develop vs install The two setup.py options develop and install are confusing me. 
According to this site, using develop creates a special link to the site-packages directory.\nPeople have suggested that I use python setup.py install for a fresh installation and python setup.py develop after any changes have been made to the setup file.\nCan anyone shed some light on the usage of these commands?\n", "output": "python setup.py install is used to install (typically third party) packages that you're not going to be developing/editing/debugging yourself.\nFor your own stuff, you want to get your package installed and then be able to frequently edit your code and not have to re-install your package; this is exactly what python setup.py develop does: it installs the package (typically just a source folder) in a way that allows you to conveniently edit your code after it's installed to the (virtual) environment and have the changes take effect immediately.\n"} {"input": "Is there a \"not equal\" operator in Python? How would you say does not equal?\nLike \nif hi == hi:\n print \"hi\"\nelif hi (does not equal) bye:\n print \"no hi\"\n\nIs there a replacement for == that means \"not equal\"?\n", "output": "Use != (preferred) or <> (deprecated). See comparison operators. For comparing object identities, you can use the keyword is and its negation is not.\ne.g.\n1 == 1 # true\n1 != 1 # false\n1 <> 1 # false\n[] is [] # false (distinct objects)\na = b = []; a is b # true (same object)\n\n"} {"input": "How to perform OR condition in django queryset? I want to write a Django query equivalent to this SQL query:\nSELECT * from user where income >= 5000 or income is NULL.\n\nHow do I construct the Django queryset filter?\nUser.objects.filter(income__gte=5000, income=0)\n\nThis doesn't work, because it ANDs the filters. 
I want to OR the filters to get the union of individual querysets.\n", "output": "from django.db.models import Q\nUser.objects.filter(Q(income__gte=5000) | Q(income__isnull=True))\n\nvia Documentation\n"} {"input": "Sort a list by multiple attributes? I have a list of lists:\n[[12, 'tall', 'blue', 1],\n[2, 'short', 'red', 9],\n[4, 'tall', 'blue', 13]]\n\nIf I wanted to sort by one element, say the tall/short element, I could do it via s = sorted(s, key = itemgetter(1)).\nIf I wanted to sort by both tall/short and colour, I could do the sort twice, once for each element, but is there a quicker way?\n", "output": "A key can be a function that returns a tuple:\ns = sorted(s, key = lambda x: (x[1], x[2]))\n\nOr you can achieve the same using itemgetter:\nimport operator\ns = sorted(s, key = operator.itemgetter(1, 2))\n\nAnd notice that here you can use sort instead of using sorted and then reassigning:\ns.sort(key = operator.itemgetter(1, 2))\n\n"} {"input": "Is Python strongly typed? I've come across links that say Python is a strongly typed language.\nHowever, I thought in strongly typed languages you couldn't do this:\nbob = 1\nbob = \"bob\"\n\nI thought a strongly typed language didn't accept type-changing at run-time. Maybe I've got a wrong (or too simplistic) definition of strong/weak types.\nSo, is Python a strongly or weakly typed language?\n", "output": "Python is strongly, dynamically typed.\n\nStrong typing means that the type of a value doesn't suddenly change. A string containing only digits doesn't magically become a number, as may happen in Perl. Every change of type requires an explicit conversion.\nDynamic typing means that runtime objects (values) have a type, as opposed to static typing where variables have a type.\n\nAs for your example\nbob = 1\nbob = \"bob\"\n\nThis works because the variable does not have a type; it can name any object. After bob=1, you'll find that type(bob) returns int, but after bob=\"bob\", it returns str. 
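Concretely, the same behaviour can be sketched as a short script (written with print calls so it runs under Python 3 as well):

```python
bob = 1
print(type(bob))          # the value 1 is an int

bob = "bob"
print(type(bob))          # the same name now refers to a str

# Strong typing: unrelated types are not silently converted.
try:
    "1" + 1
except TypeError:
    print("no implicit str/int conversion")
```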
(Note that type is a regular function, so it evaluates its argument, then returns the type of the value.)\nContrast this with older dialects of C, which were weakly, statically typed, so that pointers and integers were pretty much interchangeable. (Modern ISO C requires conversions in many cases, but my compiler is still lenient about this by default.)\nI must add that the strong vs. weak typing is more of a continuum than a boolean choice. C++ has stronger typing than C (more conversions required), but the type system can be subverted by using pointer casts.\nThe strength of the type system in a dynamic language such as Python is really determined by how its primitives and library functions respond to different types. E.g., + is overloaded so that it works on two numbers or two strings, but not a string and an number. This is a design choice made when + was implemented, but not really a necessity following from the language's semantics. In fact, when you overload + on a custom type, you can make it implicitly convert anything to a number:\ndef to_number(x):\n \"\"\"Try to convert x to a number.\"\"\"\n if x is None:\n return 0\n # more special cases here\n else:\n return float(x) # works for numbers and strings\n\nclass Foo(object):\n def __add__(self, other):\n other = to_number(other)\n # now do the addition\n\n(The only language that I know that is completely strongly typed, aka strictly typed, is Haskell, where types are entirely disjoint and only a controlled form of overloading is possible via type classes.)\n"} {"input": "Python string.join(list) on object array rather than string array In Python, I can do:\n>>> list = ['a', 'b', 'c']\n>>> ', '.join(list)\n'a, b, c'\n\nIs there any easy way to do the same when I have a list of objects?\n>>> class Obj:\n... def __str__(self):\n... 
return 'name'\n...\n>>> list = [Obj(), Obj(), Obj()]\n>>> ', '.join(list)\nTraceback (most recent call last):\n File \"\", line 1, in \nTypeError: sequence item 0: expected string, instance found\n\nOr do I have to resort to a for loop?\n", "output": "You could use a list comprehension or a generator expression instead:\n', '.join([str(x) for x in list]) # list comprehension\n', '.join(str(x) for x in list) # generator expression\n\n"} {"input": "Python multiprocessing pool.map for multiple arguments In the Python multiprocessing library, is there a variant of pool.map which supports multiple arguments?\ntext = \"test\"\ndef harvester(text, case):\n X = case[0]\n return text + str(X)\n\nif __name__ == '__main__':\n pool = multiprocessing.Pool(processes=6)\n case = RAW_DATASET\n pool.map(harvester(text,case),case, 1)\n pool.close()\n pool.join()\n\n", "output": "\nis there a variant of pool.map which supports multiple arguments?\n\nPython 3.3 includes the pool.starmap() method:\n#!/usr/bin/env python3\nfrom functools import partial\nfrom itertools import repeat\nfrom multiprocessing import Pool, freeze_support\n\ndef func(a, b):\n return a + b\n\ndef main():\n a_args = [1,2,3]\n second_arg = 1\n with Pool() as pool:\n L = pool.starmap(func, [(1, 1), (2, 1), (3, 1)])\n M = pool.starmap(func, zip(a_args, repeat(second_arg)))\n N = pool.map(partial(func, b=second_arg), a_args)\n assert L == M == N\n\nif __name__==\"__main__\":\n freeze_support()\n main()\n\nFor older versions:\n#!/usr/bin/env python2\nimport itertools\nfrom multiprocessing import Pool, freeze_support\n\ndef func(a, b):\n print a, b\n\ndef func_star(a_b):\n \"\"\"Convert `f([1,2])` to `f(1,2)` call.\"\"\"\n return func(*a_b)\n\ndef main():\n pool = Pool()\n a_args = [1,2,3]\n second_arg = 1\n pool.map(func_star, itertools.izip(a_args, itertools.repeat(second_arg)))\n\nif __name__==\"__main__\":\n freeze_support()\n main()\n\nOutput\n1 1\n2 1\n3 1\n\nNotice how itertools.izip() and itertools.repeat() are used 
here.\nDue to the bug mentioned by @unutbu you can't use functools.partial() or similar capabilities on Python 2.6, so the simple wrapper function func_star() should be defined explicitly. See also the workaround suggested by uptimebox.\n"} {"input": "Why return NotImplemented instead of raising NotImplementedError Python has a singleton called NotImplemented. \nWhy would someone ever want to return NotImplemented instead of raising the NotImplementedError exception? Won't it just make it harder to find bugs, such as code that executes invalid methods?\n", "output": "It's because __lt__() and related comparison methods are quite commonly used indirectly in list sorts and such. Sometimes the algorithm will choose to try another way or pick a default winner. Raising an exception would break out of the sort unless caught, whereas NotImplemented doesn't get raised and can be used in further tests.\nhttp://jcalderone.livejournal.com/32837.html\nTo summarise that link:\n\n\"NotImplemented signals to the runtime that it should ask someone else to satisfy the operation. In the expression a == b, if a.__eq__(b) returns NotImplemented, then Python tries b.__eq__(a). If b knows enough to return True or False, then the expression can succeed. If it doesn't, then the runtime will fall back to the built-in behavior (which is based on identity for == and !=).\"\n\n"} {"input": "How to download large file in python with requests.py? Requests is a really nice library. I'd like to use it for downloading big files (>1GB).\nThe problem is that it's not possible to keep the whole file in memory; I need to read it in chunks. And this is a problem with the following code:\nimport requests\n\ndef DownloadFile(url):\n local_filename = url.split('/')[-1]\n r = requests.get(url)\n f = open(local_filename, 'wb')\n for chunk in r.iter_content(chunk_size=512 * 1024): \n if chunk: # filter out keep-alive new chunks\n f.write(chunk)\n f.close()\n return \n\nFor some reason it doesn't work this way. 
It still loads the response into memory before saving it to a file.\n", "output": "I figured out what should be changed. The trick was to set stream=True in the get() method. \nAfter this, the Python process stopped eating memory (it stays around 30 KB regardless of the size of the downloaded file). \nThank you @danodonovan for your syntax; I use it here:\ndef download_file(url):\n local_filename = url.split('/')[-1]\n # NOTE the stream=True parameter\n r = requests.get(url, stream=True)\n with open(local_filename, 'wb') as f:\n for chunk in r.iter_content(chunk_size=1024): \n if chunk: # filter out keep-alive new chunks\n f.write(chunk)\n #f.flush() commented by recommendation from J.F.Sebastian\n return local_filename\n\nSee http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow for further reference.\n"} {"input": "Iterate a list with indexes in Python I could swear I've seen the function (or method) that takes a list, like this [3, 7, 19] and makes it into an iterable list of tuples, like so: [(0,3), (1,7), (2,19)] to use it instead of:\nfor i in range(len(name_of_list)):\n name_of_list[i] = something\n\nbut I can't remember the name and googling \"iterate list\" gets nothing.\n", "output": ">>> a = [3,4,5,6]\n>>> for i, val in enumerate(a):\n... print i, val\n...\n0 3\n1 4\n2 5\n3 6\n>>>\n\n"} {"input": "How to hide output of subprocess in Python 2.7 I'm using eSpeak on Ubuntu and have a Python 2.7 script that prints and speaks a message:\nimport subprocess\ntext = 'Hello World.'\nprint text\nsubprocess.call(['espeak', text])\n\neSpeak produces the desired sounds, but clutters the shell with some errors (ALSA lib..., no socket connect) so I cannot easily read what was printed earlier. Exit code is 0. 
\nUnfortunately there is no documented option to turn off its verbosity, so I'm looking for a way to only visually silence it and keep the open shell clean for further interaction.\nHow can I do this?\n", "output": "Redirect the output to DEVNULL:\nimport os\nimport subprocess\n\nFNULL = open(os.devnull, 'w')\nretcode = subprocess.call(['echo', 'foo'], stdout=FNULL, stderr=subprocess.STDOUT)\n\nIt is effectively the same as running this shell command:\nretcode = os.system(\"echo 'foo' &> /dev/null\")\n\n"} {"input": "Compact way of writing (a + b == c or a + c == b or b + c == a) Is there a more compact or pythonic way to write the boolean expression\na + b == c or a + c == b or b + c == a\n\nI came up with \na + b + c in (2*a, 2*b, 2*c)\n\nbut that is a little strange.\n", "output": "If we look at the Zen of Python, emphasis mine:\n\nThe Zen of Python, by Tim Peters\nBeautiful is better than ugly.\n Explicit is better than implicit.\nSimple is better than complex.\n Complex is better than complicated.\n Flat is better than nested.\n Sparse is better than dense.\nReadability counts.\n Special cases aren't special enough to break the rules.\n Although practicality beats purity.\n Errors should never pass silently.\n Unless explicitly silenced.\n In the face of ambiguity, refuse the temptation to guess.\nThere should be one-- and preferably only one --obvious way to do it.\n Although that way may not be obvious at first unless you're Dutch.\n Now is better than never.\n Although never is often better than right now.\nIf the implementation is hard to explain, it's a bad idea.\nIf the implementation is easy to explain, it may be a good idea.\n Namespaces are one honking great idea -- let's do more of those!\n\nThe most Pythonic solution is the one that is clearest, simplest, and easiest to explain:\na + b == c or a + c == b or b + c == a\n\nEven better, you don't even need to know Python to understand this code! It's that easy. 
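To see it in action, the expression can be wrapped in a small helper (the function name here is just for illustration):

```python
def sums_match(a, b, c):
    # True when any one of the three numbers equals the sum of the other two
    return a + b == c or a + c == b or b + c == a

print(sums_match(1, 2, 3))   # True:  1 + 2 == 3
print(sums_match(5, 2, 3))   # True:  2 + 3 == 5
print(sums_match(1, 1, 3))   # False: no combination adds up
```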
This is, without reservation, the best solution. Anything else is intellectual masturbation.\nFurthermore, this is likely the best performing solution as well, as it is the only one out of all the proposals that short circuits. If a + b == c, only a single addition and comparison is done. \n"} {"input": "Python Infinity - Any caveats? So Python has positive and negative infinity:\nfloat(\"inf\"), float(\"-inf\")\n\nThis just seems like the type of feature that has to have some caveat. Is there anything I should be aware of? \n", "output": "You can still get not-a-number (NaN) values from simple arithmetic involving inf:\n>>> 0 * float(\"inf\")\nnan\n\nNote that you will normally not get an inf value through usual arithmetic calculations:\n>>> 2.0**2\n4.0\n>>> _**2\n16.0\n>>> _**2\n256.0\n>>> _**2\n65536.0\n>>> _**2\n4294967296.0\n>>> _**2\n1.8446744073709552e+19\n>>> _**2\n3.4028236692093846e+38\n>>> _**2\n1.157920892373162e+77\n>>> _**2\n1.3407807929942597e+154\n>>> _**2\nTraceback (most recent call last):\n File \"\", line 1, in ?\nOverflowError: (34, 'Numerical result out of range')\n\nThe inf value is considered a very special value with unusual semantics, so it's better to know about an OverflowError straight away through an exception, rather than having an inf value silently injected into your calculations.\n"} {"input": "find nearest value in numpy array is there a numpy-thonic way, e.g. function, to find the 'nearest value' in an array? 
\nexample:\nnp.find_nearest( array, value )\n\n", "output": "import numpy as np\ndef find_nearest(array,value):\n idx = (np.abs(array-value)).argmin()\n return array[idx]\n\narray = np.random.random(10)\nprint(array)\n# [ 0.21069679 0.61290182 0.63425412 0.84635244 0.91599191 0.00213826\n# 0.17104965 0.56874386 0.57319379 0.28719469]\n\nvalue = 0.5\n\nprint(find_nearest(array, value))\n# 0.568743859261\n\n"} {"input": "Case insensitive Python regular expression without re.compile In Python, I can compile a regular expression to be case-insensitive using re.compile:\n>>> s = 'TeSt'\n>>> casesensitive = re.compile('test')\n>>> ignorecase = re.compile('test', re.IGNORECASE)\n>>> \n>>> print casesensitive.match(s)\nNone\n>>> print ignorecase.match(s)\n<_sre.SRE_Match object at 0x02F0B608>\n\nIs there a way to do the same, but without using re.compile. I can't find anything like Perl's i suffix (e.g. m/test/i) in the documentation.\n", "output": "Pass re.IGNORECASE to the flags param of search, match, or sub:\nre.search('test', 'TeSt', re.IGNORECASE)\nre.match('test', 'TeSt', re.IGNORECASE)\nre.sub('test', 'xxxx', 'Testing', flags=re.IGNORECASE)\n\n"} {"input": "Saving utf-8 texts in json.dumps as UTF8, not as \\u escape sequence sample code:\n>>> import json\n>>> json_string = json.dumps(\"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\")\n>>> print json_string\n\"\\u05d1\\u05e8\\u05d9 \\u05e6\\u05e7\\u05dc\\u05d4\"\n\nThe problem: it's not human readable. My (smart) users want to verify or even edit text files with JSON dumps. 
(and i'd rather not use XML)\nIs there a way to serialize objects into utf-8 json string (instead of \\uXXXX ) ?\nthis doesn't help:\n>>> output = json_string.decode('string-escape')\n\"\\u05d1\\u05e8\\u05d9 \\u05e6\\u05e7\\u05dc\\u05d4\"\n\nthis works, but if any sub-objects is a python-unicode and not utf-8, it'll dump garbage:\n>>> #### ok:\n>>> s= json.dumps( \"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\", ensure_ascii=False) \n>>> print json.loads(s) \n\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\n\n>>> #### NOT ok:\n>>> d={ 1: \"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\", 2: u\"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\" }\n>>> print d\n{1: '\\xd7\\x91\\xd7\\xa8\\xd7\\x99 \\xd7\\xa6\\xd7\\xa7\\xd7\\x9c\\xd7\\x94', \n 2: u'\\xd7\\x91\\xd7\\xa8\\xd7\\x99 \\xd7\\xa6\\xd7\\xa7\\xd7\\x9c\\xd7\\x94'}\n>>> s = json.dumps( d, ensure_ascii=False, encoding='utf8')\n>>> print json.loads(s)['1']\n\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\n>>> print json.loads(s)['2']\n\u00c3\u0097\u00c3\u0097\u00c2\u00a8\u00c3\u0097 \u00c3\u0097\u00c2\u00a6\u00c3\u0097\u00c2\u00a7\u00c3\u0097\u00c3\u0097\n\ni searched the json.dumps documentation but couldn't find something useful.\nEdit - Solution(?):\ni'll try to sum up the comments and answers by Martijn Pieters:\n(edit: 2nd thought after @Sebastian's comment and about a year later)\n\nthere might be no is a built-in solution in json.dumps.\ni'll have to convert all strings to UTF8 Unicode the object before it's being JSON-ed.\ni'll use Mark's function that converts strings recuresively in a nested object\nthe example I gave depends too much on my computer & IDE environment, and doesn't run the same on all computers.\n\nThank you everybody :) \n", "output": "Use the ensure_ascii=False switch to json.dumps(), then encode 
the value to UTF-8 manually:\n>>> json_string = json.dumps(u\"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\", ensure_ascii=False).encode('utf8')\n>>> json_string\n'\"\\xd7\\x91\\xd7\\xa8\\xd7\\x99 \\xd7\\xa6\\xd7\\xa7\\xd7\\x9c\\xd7\\x94\"'\n>>> print json_string\n\"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\"\n\nIf you are writing this to a file, you can use io.open() instead of open() to produce a file object that encodes Unicode values for you as you write, then use json.dump() instead to write to that file:\nwith io.open('filename', 'w', encoding='utf8') as json_file:\n json.dump(u\"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\", json_file, ensure_ascii=False)\n\nIn Python 3, the built-in open() is an alias for io.open(). Do note that there is a bug in the json module where the ensure_ascii=False flag can produce a mix of unicode and str objects. The workaround for Python 2 then is:\nwith io.open('filename', 'w', encoding='utf8') as json_file:\n data = json.dumps(u\"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\", ensure_ascii=False)\n # unicode(data) auto-decodes data to unicode if str\n json_file.write(unicode(data))\n\nIf you are passing in byte strings (type str in Python 2, bytes in Python 3) encoded to UTF-8, make sure to also set the encoding keyword:\n>>> d={ 1: \"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\", 2: u\"\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\" }\n>>> d\n{1: '\\xd7\\x91\\xd7\\xa8\\xd7\\x99 \\xd7\\xa6\\xd7\\xa7\\xd7\\x9c\\xd7\\x94', 2: u'\\u05d1\\u05e8\\u05d9 \\u05e6\\u05e7\\u05dc\\u05d4'}\n\n>>> s=json.dumps(d, ensure_ascii=False, encoding='utf8')\n>>> s\nu'{\"1\": \"\\u05d1\\u05e8\\u05d9 \\u05e6\\u05e7\\u05dc\\u05d4\", \"2\": \"\\u05d1\\u05e8\\u05d9 \\u05e6\\u05e7\\u05dc\\u05d4\"}'\n>>> 
json.loads(s)['1']\nu'\\u05d1\\u05e8\\u05d9 \\u05e6\\u05e7\\u05dc\\u05d4'\n>>> json.loads(s)['2']\nu'\\u05d1\\u05e8\\u05d9 \\u05e6\\u05e7\\u05dc\\u05d4'\n>>> print json.loads(s)['1']\n\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\n>>> print json.loads(s)['2']\n\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\n\nNote that your second sample is not valid Unicode; you gave it UTF-8 bytes as a unicode literal, that would never work:\n>>> s = u'\\xd7\\x91\\xd7\\xa8\\xd7\\x99 \\xd7\\xa6\\xd7\\xa7\\xd7\\x9c\\xd7\\x94'\n>>> print s\n\u00c3\u0097\u00c3\u0097\u00c2\u00a8\u00c3\u0097 \u00c3\u0097\u00c2\u00a6\u00c3\u0097\u00c2\u00a7\u00c3\u0097\u00c3\u0097\n>>> print s.encode('latin1').decode('utf8')\n\u00d7\u0091\u00d7\u00a8\u00d7\u0099 \u00d7\u00a6\u00d7\u00a7\u00d7\u009c\u00d7\u0094\n\nOnly when I encoded that string to Latin 1 (whose unicode codepoints map one-to-one to bytes) then decode as UTF-8 do you see the expected output. That has nothing to do with JSON and everything to do with that you use the wrong input. The result is called a Mojibake.\nIf you got that Unicode value from a string literal, it was decoded using the wrong codec. It could be your terminal is mis-configured, or that your text editor saved your source code using a different codec than what you told Python to read the file with. Or you sourced it from a library that applied the wrong codec. This all has nothing to do with the JSON library.\n"} {"input": "How to delete items from a dictionary while iterating over it? Is it legitimate to delete items from a dictionary in Python while iterating over it?\nFor example:\nfor k, v in mydict.iteritems():\n if k == val:\n del mydict[k]\n\nThe idea is to remove elements that don't meet a certain condition from the dictionary, instead of creating a new dictionary that's a subset of the one being iterated over.\nIs this a good solution? 
Are there more elegant/efficient ways?\n", "output": "A simple test in the console shows you cannot modify a dictionary while iterating over it:\n>>> mydict = {'one': 1, 'two': 2, 'three': 3, 'four': 4}\n>>> for k, v in mydict.iteritems():\n... if k == 'two':\n... del mydict[k]\n...\n------------------------------------------------------------\nTraceback (most recent call last):\n File \"\", line 1, in \nRuntimeError: dictionary changed size during iteration\n\nAs stated in delnan's answer, deleting entries causes problems when the iterator tries to move onto the next entry. Instead, use the keys() method to get a list of the keys and work with that:\n>>> for k in mydict.keys():\n... if k == 'two':\n... del mydict[k]\n...\n>>> mydict\n{'four': 4, 'three': 3, 'one': 1}\n\nIf you need to delete based on the item's value, use the items() method instead:\n>>> for k, v in mydict.items():\n... if v == 3:\n... del mydict[k]\n...\n>>> mydict\n{'four': 4, 'one': 1}\n\n"} {"input": "How to execute a file within the python interpreter? I'm trying to execute a file with Python commands from within the interpreter. \nEDIT: I'm trying to use variables and settings from that file, not to invoke a separate process.\n", "output": "Several ways.\nFrom the shell\npython someFile.py\n\nFrom inside IDLE, hit F5.\nIf you're typing interactively, try this:\n>>> variables= {}\n>>> execfile( \"someFile.py\", variables )\n>>> print variables # globals from the someFile module\n\n"} {"input": "Python dictionary: are keys() and values() always the same order? 
It looks like the lists returned by keys() and values() methods of a dictionary are always a 1-to-1 mapping (assuming the dictionary is not altered between calling the 2 methods).\nFor example:\n>>> d = {'one':1, 'two': 2, 'three': 3}\n>>> k, v = d.keys(), d.values()\n>>> for i in range(len(k)):\n print d[k[i]] == v[i]\n\nTrue\nTrue\nTrue\n\nIf you do not alter the dictionary between calling keys() and calling values(), is it wrong to assume the above for-loop will always print True? I could not find any documentation confirming this.\n", "output": "Found this:\n\nIf items(), keys(), values(),\n iteritems(), iterkeys(), and\n itervalues() are called with no\n intervening modifications to the\n dictionary, the lists will directly\n correspond.\n\nOn 2.x documentation and 3.x documentation.\n"} {"input": "Python: split a list based on a condition? What's the best way, both aesthetically and from a performance perspective, to split a list of items into multiple lists based on a conditional? The equivalent of:\ngood = [x for x in mylist if x in goodvals]\nbad = [x for x in mylist if x not in goodvals]\n\nis there a more elegant way to do this?\nUpdate: here's the actual use case, to better explain what I'm trying to do:\n# files looks like: [ ('file1.jpg', 33L, '.jpg'), ('file2.avi', 999L, '.avi'), ... ]\nIMAGE_TYPES = ('.jpg','.jpeg','.gif','.bmp','.png')\nimages = [f for f in files if f[2].lower() in IMAGE_TYPES]\nanims = [f for f in files if f[2].lower() not in IMAGE_TYPES]\n\n", "output": "good, bad = [], []\nfor x in mylist:\n (bad, good)[x in goodvals].append(x)\n\n"} {"input": "How to use a different version of python during NPM install? Salam (means Hello) :)\nI have terminal access to a VPS running centos 5.9 and default python 2.4.3 installed. 
I also installed python 2.7.3 via these commands: (I used make altinstall instead of make install)\nwget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz\ntar -xf Python-2.7.3.tgz\ncd Python-2.7.3\n./configure\nmake\nmake altinstall\n\nthen I installed node.js from source via these commands:\npython2.7 ./configure\nmake\nmake install\n\nThe problem is, when I use npm install and try to install a node.js package which requires python > 2.4.3 I get this error:\ngyp ERR! configure error\ngyp ERR! stack Error: Python executable \"python\" is v2.4.3, which is not supported by gyp.\ngyp ERR! stack You can pass the --python switch to point to Python >= v2.5.0 & < 3.0.0.\ngyp ERR! stack at failPythonVersion (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:125:14)\ngyp ERR! stack at /usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:114:9\n\nhow should I \"pass the --python switch to point to Python >= v2.5.0\"?\n", "output": "You can use --python option to npm like so:\nnpm install --python=python2.7\n\nor set it to be used always:\nnpm config set python python2.7\n\nNpm will in turn pass this option to node-gyp when needed.\n(note: I'm the one who opened an issue on Github to have this included in the docs, as there were so many questions about it ;-) )\n"} {"input": "Getting method parameter names in python Given the python function:\ndef aMethod(arg1, arg2):\n pass\n\nHow can I extract the number and names of the arguments. Ie. given that I have a reference to func, I want the func.[something] to return (\"arg1\", \"arg2\")\nThe usage scenario for this is that I have a decorator, and I wish to use the method arguments in the same order that they appear for the actual function as a key. Ie. 
how would the decorator look that printed \"a,b\" when I call aMethod(\"a\",\"b\")\n", "output": "Take a look at the inspect module - this will do the inspection of the various code object properties for you.\n>>> inspect.getargspec(aMethod)\n(['arg1', 'arg2'], None, None, None)\n\nThe other results are the name of the *args and **kwargs variables, and the defaults provided. ie.\n>>> def foo(a,b,c=4, *arglist, **keywords): pass\n>>> inspect.getargspec(foo)\n(['a', 'b', 'c'], 'arglist', 'keywords', (4,))\n\n"} {"input": "python decorators with parameters I have a problem with the transfer of variable 'insurance_mode' by the decorator. I would do it by the following decorator statement:\n @execute_complete_reservation(True)\n def test_booking_gta_object(self):\n self.test_select_gta_object()\n\nbut unfortunately, this statement does not work. Perhaps maybe there is better way to solve this problem.\ndef execute_complete_reservation(test_case,insurance_mode):\n def inner_function(self,*args,**kwargs):\n self.test_create_qsf_query()\n test_case(self,*args,**kwargs)\n self.test_select_room_option()\n if insurance_mode:\n self.test_accept_insurance_crosseling()\n else:\n self.test_decline_insurance_crosseling()\n self.test_configure_pax_details()\n self.test_configure_payer_details\n\n return inner_function\n\n", "output": "You mean def test_booking_gta_object, right? Anyway, the syntax for decorators with arguments is a bit different - the decorator with arguments should return a function that will take a function and return another function. So it should really return a normal decorator. A bit confusing, right? 
What I mean is:\ndef decorator(argument):\n def real_decorator(function):\n def wrapper(*args, **kwargs):\n funny_stuff()\n something_with_argument(argument)\n function(*args, **kwargs)\n more_funny_stuff()\n return wrapper\n return real_decorator\n\nHere you can read more on the subject - it's also possible to implement this using callable objects and that is also explained there.\n"} {"input": "Rename a dictionary key Is there a way to rename a dictionary key, without reassigning its value to a new name and removing the old name key; and without iterating through dict key/value?\nIn case of OrderedDict, do the same, while keeping that key's position.\n", "output": "For a regular dict, you can use:\nmydict[new_key] = mydict.pop(old_key)\n\nFor an OrderedDict, I think you must build an entirely new one using a comprehension. \n>>> OrderedDict(zip('123', 'abc'))\nOrderedDict([('1', 'a'), ('2', 'b'), ('3', 'c')])\n>>> oldkey, newkey = '2', 'potato'\n>>> OrderedDict((newkey if k == oldkey else k, v) for k, v in _.viewitems())\nOrderedDict([('1', 'a'), ('potato', 'b'), ('3', 'c')])\n\nModifying the key itself, as this question seems to be asking, is impractical because dict keys are usually immutable objects such as numbers, strings or tuples. Instead of trying to modify the key, reassigning the value to a new key and removing the old key is how you can achieve the \"rename\" in python. \n"} {"input": "Python + Django page redirect How do I accomplish a simple redirect (e.g. 
cflocation in ColdFusion, or header(location:http://) in PHP)?\n", "output": "It's simple:\nfrom django.http import HttpResponseRedirect\n\ndef myview(request):\n ...\n return HttpResponseRedirect(\"/path/\")\n\nMore info in the official Django docs\nUpdate: Django 1.0\nThere is apparently a better way of doing this in Django now using generic views.\nExample -\nfrom django.views.generic.simple import redirect_to\n\nurlpatterns = patterns('', \n (r'^one/$', redirect_to, {'url': '/another/'}),\n\n #etc...\n)\n\nThere is more in the generic views documentation.\nCredit - Carles Barrobés.\nUpdate #2: Django 1.3+\nIn Django 1.5 *redirect_to* no longer exists and has been replaced by RedirectView. Credit to Yonatan\nfrom django.views.generic import RedirectView\n\nurlpatterns = patterns('',\n (r'^one/$', RedirectView.as_view(url='/another/')),\n)\n\n"} {"input": "How to send email attachments with Python I am having problems understanding how to email an attachment using Python. I have successfully emailed simple messages with the smtplib. Could someone please explain how to send an attachment in an email? I know there are other posts online but as a Python beginner I find them hard to understand.
\n", "output": "Here's another, adapted from here:\nimport smtplib\nfrom os.path import basename\nfrom email.mime.application import MIMEApplication\nfrom email.mime.multipart import MIMEMultipart\nfrom email.mime.text import MIMEText\nfrom email.utils import COMMASPACE, formatdate\n\n\ndef send_mail(send_from, send_to, subject, text, files=None,\n server=\"127.0.0.1\"):\n assert isinstance(send_to, list)\n\n msg = MIMEMultipart()\n msg['From'] = send_from\n msg['To'] = COMMASPACE.join(send_to)\n msg['Date'] = formatdate(localtime=True)\n msg['Subject'] = subject\n\n msg.attach(MIMEText(text))\n\n for f in files or []:\n with open(f, \"rb\") as fil:\n part = MIMEApplication(\n fil.read(),\n Name=basename(f)\n )\n part['Content-Disposition'] = 'attachment; filename=\"%s\"' % basename(f)\n msg.attach(part)\n\n\n smtp = smtplib.SMTP(server)\n smtp.sendmail(send_from, send_to, msg.as_string())\n smtp.close()\n\nIt's much the same as the first example... But it should be easier to drop in.\n"} {"input": "Python creating a dictionary of lists I want to create a dictionary whose values are lists. For example: \n{\n 1: ['1'],\n 2: ['1','2'],\n 3: ['2']\n}\n\nIf I do:\nd = dict()\na = ['1', '2']\nfor i in a:\n for j in range(int(i), int(i) + 2): \n d[j].append(i)\n\nI get a KeyError, because d[...] isn't a list. In this case, I can add the following code after the assignment of a to initialize the dictionary. \nfor x in range(1, 4):\n d[x] = list()\n\nIs there a better way to do this? Lets say I don't know the keys I am going to need until I am in the second for loop. For example:\nclass relation:\n scope_list = list()\n...\nd = dict()\nfor relation in relation_list:\n for scope_item in relation.scope_list:\n d[scope_item].append(relation)\n\nAn alternative would then be replacing \nd[scope_item].append(relation)\n\nwith \nif d.has_key(scope_item):\n d[scope_item].append(relation)\nelse:\n d[scope_item] = [relation,]\n\nWhat is the best way to handle this? 
Ideally, appending would \"just work\". Is there some way to express that I want a dictionary of empty lists, even if I don't know every key when I first create the list?\n", "output": "You can use defaultdict:\n>>> from collections import defaultdict\n>>> d = defaultdict(list)\n>>> for i in a:\n... for j in range(int(i), int(i) + 2):\n... d[j].append(i)\n...\n>>> d\ndefaultdict(, {1: ['1'], 2: ['1', '2'], 3: ['2']})\n>>> d.items()\n[(1, ['1']), (2, ['1', '2']), (3, ['2'])]\n\n"} {"input": "Changing the \"tick frequency\" on x or y axis in matplotlib? I am trying to fix how python plots my data.\nSay \nx = [0,5,9,10,15]\n\nand \ny = [0,1,2,3,4]\n\nThen I would do:\nmatplotlib.pyplot.plot(x,y)\nmatplotlib.pyplot.show()\n\nand the x axis' ticks are plotted in intervals of 5. Is there a way to make it show intervals of 1?\n", "output": "You could explicitly set where you want to tick marks with plt.xticks:\nplt.xticks(np.arange(min(x), max(x)+1, 1.0))\n\n\nFor example,\nimport numpy as np\nimport matplotlib.pyplot as plt\n\nx = [0,5,9,10,15]\ny = [0,1,2,3,4]\nplt.plot(x,y)\nplt.xticks(np.arange(min(x), max(x)+1, 1.0))\nplt.show()\n\n\n(np.arange was used rather than Python's range function just in case min(x) and max(x) are floats instead of ints.) \n\nThe plt.plot (or ax.plot) function will automatically set default x and y limits. If you wish to keep those limits, and just change the stepsize of the tick marks, then you could use ax.get_xlim() to discover what limits Matplotlib has already set. \nstart, end = ax.get_xlim()\nax.xaxis.set_ticks(np.arange(start, end, stepsize))\n\nThe default tick formatter should do a decent job rounding the tick values to a sensible number of significant digits. However, if you wish to have more control over the format, you can define your own formatter. 
For example,\nax.xaxis.set_major_formatter(ticker.FormatStrFormatter('%0.1f'))\n\nHere's a runnable example:\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport matplotlib.ticker as ticker\n\nx = [0,5,9,10,15]\ny = [0,1,2,3,4]\nfig, ax = plt.subplots()\nax.plot(x,y)\nstart, end = ax.get_xlim()\nax.xaxis.set_ticks(np.arange(start, end, 0.712123))\nax.xaxis.set_major_formatter(ticker.FormatStrFormatter('%0.1f'))\nplt.show()\n\n"} {"input": "Get the cartesian product of a series of lists in Python How can I get the Cartesian product (every possible combination of values) from a group of lists?\nInput: \nsomelists = [\n [1, 2, 3],\n ['a', 'b'],\n [4, 5]\n]\n\nDesired output:\n[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), (2, 'a', 5) ...]\n\n", "output": "In Python 2.6+\nimport itertools\nfor element in itertools.product(*somelists):\n print element\n\nDocumentation:\nPython 2 - itertools.product\n"} {"input": "Why dict.get(key) instead of dict[key]? Today, I came across the dict method get which, given a key in the dictionary, returns the associated value.\nFor what purpose is this function useful? If I wanted to find a value associated with a key in a dictionary, I can just do dict[key], and it returns the same thing:\ndictionary = {\"Name\": \"Harry\", \"Age\": 17}\ndictionary[\"Name\"]\ndictionary.get(\"Name\")\n\n", "output": "It allows you to provide a default value if the key is missing:\ndictionary.get(\"bogus\", None)\n\nreturns None, whereas\ndictionary[\"bogus\"]\n\nwould raise a KeyError.\n"} {"input": "How do I override __getattr__ in Python without breaking the default behavior? I want to override the __getattr__ method on a class to do something fancy but I don't want to break the default behavior.\nWhat's the correct way to do this?\n", "output": "Overriding __getattr__ should be fine -- __getattr__ is only called as a last resort i.e. if there are no attributes in the instance that match the name. 
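A minimal runnable sketch of that last-resort behaviour (the class and attribute names here are invented purely for illustration):

```python
class Demo(object):
    def __init__(self):
        self.present = "stored"  # a normal instance attribute

    def __getattr__(self, name):
        # Reached only when ordinary attribute lookup fails.
        return "fallback:" + name

d = Demo()
print(d.present)  # ordinary lookup succeeds; __getattr__ is never called
print(d.missing)  # no such attribute exists, so __getattr__ supplies the value
```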
For instance, if you access foo.bar, then __getattr__ will only be called if foo has no attribute called bar. If the attribute is one you don't want to handle, raise AttributeError:\nclass Foo(object):\n def __getattr__(self, name):\n if some_predicate(name):\n # ...\n else:\n # Default behaviour\n raise AttributeError\n\nHowever, unlike __getattr__, __getattribute__ will be called first (only works for new style classes i.e. those that inherit from object). In this case, you can preserve default behaviour like so:\nclass Foo(object):\n def __getattribute__(self, name):\n if some_predicate(name):\n # ...\n else:\n # Default behaviour\n return object.__getattribute__(self, name)\n\nSee the Python docs for more.\n"} {"input": "Configuring so that pip install can work from github We'd like to use pip with github to install private packages to our production servers. This question concerns what needs to be in the github repo in order for the install to be successful.\nAssuming the following command line (which authenticates just fine and tries to install):\npip install git+ssh://git@github.com/BlahCo/search/tree/prod_release_branch/ProductName\n\nWhat needs to reside in the ProductName? Is it the contents of what would normally be in the tar file after running setup.py with the sdist option, or is the actual tar.gz file, or something else?\nI'm asking here because I've tried several variations and can't make it work. 
Any help appreciated.\n", "output": "You need the whole python package, with a setup.py file in it.\nA package named foo would be:\nfoo # the installable package\n├── foo\n│   ├── __init__.py\n│   └── bar.py\n└── setup.py\n\nAnd install from github like:\n$ pip install git+git://github.com/myuser/foo.git@v123\nor\n$ pip install git+git://github.com/myuser/foo.git@newbranch\n\nMore info at http://www.pip-installer.org/en/latest/reference/pip_install.html#vcs-support\n"} {"input": "I can't install python-ldap When I run the following command:\nsudo pip install python-ldap\n\nI get this error:\n\nIn file included from Modules/LDAPObject.c:9:\nModules/errors.h:8: fatal error: lber.h: No such file or directory\n\nAny ideas how to fix this?\n", "output": "The website says that python-ldap is based on OpenLDAP, so you need to have the development files (headers) in order to compile the Python module. If you're on Ubuntu, the package is called libldap2-dev.\nsudo apt-get install libsasl2-dev python-dev libldap2-dev libssl-dev\n\n"} {"input": "What do (lambda) function closures capture in Python? Recently I started playing around with Python and I came around something peculiar in the way closures work. Consider the following code:\nadders= [0,1,2,3]\nfor i in [0,1,2,3]:\n adders[i]=lambda a: i+a\n\nprint adders[1](3)\n\nIt builds a simple array of functions that take a single input and return that input plus a number. The functions are constructed in a for loop where the iterator i runs from 0 to 3. For each of these numbers a lambda function is created which captures i and adds it to the function's input. The last line calls the second lambda function with 3 as a parameter.
To my surprise the output was:\n6\n\nI expected a 4. My reasoning was: in Python everything is an object and thus every variable is essentially a pointer to it. When creating the lambda closures for i, I expected it to store a pointer to the integer object currently pointed to by i. That means that when i is assigned a new integer object it shouldn't affect the previously created closures. Sadly, inspecting the adders array within a debugger shows that it does. All lambda functions refer to the last value of i, 3, which results in adders[1] returning 6.\nThis raises the following questions:\n\nWhat do the closures capture exactly?\nWhat is the most elegant way to convince the lambda functions to capture the current value of i in a way that will not be affected when i changes its value? \n\n", "output": "you may force the capture of a variable using an argument with a default value:\n>>> for i in [0,1,2,3]:\n... adders[i]=lambda a,i=i: i+a # note the dummy parameter with a default value\n...\n>>> print( adders[1](3) )\n4\n\nthe idea is to declare a parameter (cleverly named i) and give it a default value of the variable you want to capture (the value of i)\n"} {"input": "How do I design a class in Python? I've had some really awesome help on my previous questions for detecting paws and toes within a paw, but all these solutions only work for one measurement at a time.\nNow I have data that consists of:\n\nabout 30 dogs; \neach has 24 measurements (divided into several subgroups); \neach measurement has at least 4 contacts (one for each paw) and \n\neach contact is divided into 5 parts and \nhas several parameters, like contact time, location, total force etc.\n\n\n\nObviously sticking everything into one big object isn't going to cut it, so I figured I needed to use classes instead of the current slew of functions.
But even though I've read Learning Python's chapter about classes, I fail to apply it to my own code (GitHub link)\nI also feel like it's rather strange to process all the data every time I want to get out some information. Once I know the locations of each paw, there's no reason for me to calculate this again. Furthermore, I want to compare all the paws of the same dog to determine which contact belongs to which paw (front/hind, left/right). This would become a mess if I continue using only functions.\nSo now I'm looking for advice on how to create classes that will let me process my data (link to the zipped data of one dog) in a sensible fashion. \n", "output": "How to design a class.\n\nWrite down the words. You started to do this. Some people don't and wonder why they have problems.\nExpand your set of words into simple statements about what these objects will be doing. That is to say, write down the various calculations you'll be doing on these things. Your short list of 30 dogs, 24 measurements, 4 contacts, and several \"parameters\" per contact is interesting, but only part of the story. Your \"locations of each paw\" and \"compare all the paws of the same dog to determine which contact belongs to which paw\" are the next step in object design.\nUnderline the nouns. Seriously. Some folks debate the value of this, but I find that for first-time OO developers it helps. Underline the nouns.\nReview the nouns. Generic nouns like \"parameter\" and \"measurement\" need to be replaced with specific, concrete nouns that apply to your problem in your problem domain. Specifics help clarify the problem. Generics simply elide details.\nFor each noun (\"contact\", \"paw\", \"dog\", etc.) write down the attributes of that noun and the actions in which that object engages. Don't short-cut this. Every attribute. 
\"Data Set contains 30 Dogs\" for example is important.\nFor each attribute, identify if this is a relationship to a defined noun, or some other kind of \"primitive\" or \"atomic\" data like a string or a float or something irreducible.\nFor each action or operation, you have to identify which noun has the responsibility, and which nouns merely participate. It's a question of \"mutability\". Some objects get updated, others don't. Mutable objects must own total responsibility for their mutations. \nAt this point, you can start to transform nouns into class definitions. Some collective nouns are lists, dictionaries, tuples, sets or namedtuples, and you don't need to do very much work. Other classes are more complex, either because of complex derived data or because of some update/mutation which is performed.\n\nDon't forget to test each class in isolation using unittest.\nAlso, there's no law that says classes must be mutable. In your case, for example, you have almost no mutable data. What you have is derived data, created by transformation functions from the source dataset. \n"} {"input": "No Multiline Lambda in Python: Why not? I've heard it said that multiline lambdas can't be added in Python because they would clash syntactically with the other syntax constructs in Python. I was thinking about this on the bus today and realized I couldn't think of a single Python construct that multiline lambdas clash with. Given that I know the language pretty well, this surprised me.\nNow, I'm sure Guido had a reason for not including multiline lambdas in the language, but out of curiosity: what's a situation where including a multiline lambda would be ambiguous? 
Is what I've heard true, or is there some other reason that Python doesn't allow multiline lambdas?\n", "output": "Guido van Rossum (the inventor of Python) answers this exact question himself in an old blog post.\nBasically, he admits that it's theoretically possible, but that any proposed solution would be un-Pythonic:\n\n\"But the complexity of any proposed solution for this puzzle is immense, to me: it requires the parser (or more precisely, the lexer) to be able to switch back and forth between indent-sensitive and indent-insensitive modes, keeping a stack of previous modes and indentation level. Technically that can all be solved (there's already a stack of indentation levels that could be generalized). But none of that takes away my gut feeling that it is all an elaborate Rube Goldberg contraption.\"\n\n"} {"input": "Reusable library to get human readable version of file size? There are various snippets on the web that would give you a function to return human readable size from bytes size:\n>>> human_readable(2048)\n'2 kilobytes'\n>>>\n\nBut is there a Python library that provides this?\n", "output": "Addressing the above \"too small a task to require a library\" issue by a straightforward implementation:\ndef sizeof_fmt(num, suffix='B'):\n for unit in ['','Ki','Mi','Gi','Ti','Pi','Ei','Zi']:\n if abs(num) < 1024.0:\n return \"%3.1f%s%s\" % (num, unit, suffix)\n num /= 1024.0\n return \"%.1f%s%s\" % (num, 'Yi', suffix)\n\nSupports:\n\nall currently known binary prefixes\nnegative and positive numbers\nnumbers larger than 1000 Yobibytes\narbitrary units (maybe you like to count in Gibibits!)\n\nExample:\n>>> sizeof_fmt(168963795964)\n'157.4GiB'\n\nby Fred Cirera\n"} {"input": "How to check if a user is logged in (how to properly use user.is_authenticated)? I am looking over this website but just can't seem to figure out how to do this as it's not working. 
I need to check if the current site user is logged in (authenticated), and am trying:\nrequest.user.is_authenticated\n\ndespite being sure that the user is logged in, it returns just:\n>\n\nI'm able to do other requests (from the first section in the url above), such as:\nrequest.user.is_active\n\nwhich returns a successful response.\n", "output": "is_authenticated is a function. You should call it like\nif request.user.is_authenticated():\n # do something if the user is authenticated\n\nAs Peter Rowell pointed out, what may be tripping you up is that in the default Django template language, you don't tack on parenthesis to call functions. So you may have seen something like this in template code:\n{% if user.is_authenticated %}\n\nHowever, in Python code, it is indeed a method in the User class.\n"} {"input": "How to change legend size with matplotlib.pyplot Simple question here: I'm just trying to get the size of my legend using matplotlib.pyplot to be smaller (ie, the text to be smaller). The code I'm using goes something like this:\nplot.figure()\nplot.scatter(k,sum_cf, color = 'black', label='Sum of Cause Fractions')\nplot.scatter(k, data[:, 0], color = 'b', label = 'Dis 1: cf = .6, var = .2')\nplot.scatter(k, data[:, 1], color = 'r', label = 'Dis 2: cf = .2, var = .1')\nplot.scatter(k, data[:, 2], color = 'g', label = 'Dis 3: cf = .1, var = .01')\nplot.legend(loc=2)\n\n", "output": "You can set an individual font size for the legend by adjusting the 'prop' keyword. \nplot.legend(loc=2,prop={'size':6})\n\nThis takes a dictionary of keywords corresponding to matplotlib.font_manager.FontProperties properties. See the documentation for legend:\n\nKeyword arguments:\nprop: [ None | FontProperties | dict ]\n A matplotlib.font_manager.FontProperties instance. If prop is a \n dictionary, a new instance will be created with prop. 
If None, use\n rc settings.\n\n\nIt is also possible, as of version 1.2.1, to use the keyword fontsize.\n"} {"input": "What is the correct syntax for 'else if'? I'm a new Python programmer who is making the leap from 2.6.4 to 3.1.1. Everything has gone fine until I tried to use the 'else if' statement. The interpreter gives me a syntax error after the 'if' in 'else if' for a reason I can't seem to figure out.\ndef function(a):\n if a == '1':\n print ('1a')\n else if a == '2'\n print ('2a')\n else print ('3a')\n\nfunction(input('input:'))\n\nI'm probably missing something very simple; however, I haven't been able to find the answer on my own. \n", "output": "In python \"else if\" is spelled \"elif\".\nAlso, you need a colon after the elif and the else.\nSimple answer to a simple question. I had the same problem, when I first started (in the last couple of weeks).\nSo your code should read:\ndef function(a):\n if a == '1':\n print('1a')\n elif a == '2':\n print('2a')\n else:\n print('3a')\n\nfunction(input('input:'))\n\n"} {"input": "Convert string representation of list to list in Python I was wondering what the simplest way is to convert a string list like the following to a list:\nx = u'[ \"A\",\"B\",\"C\" , \" D\"]'\n\nEven in case user puts spaces in between the commas, and spaces inside of the quotes. I need to handle that as well to:\nx = [\"A\", \"B\", \"C\", \"D\"] \n\nin Python.\nI know I can strip spaces with strip() and split() using the split operator and check for non alphabets. But the code was getting very kludgy. Is there a quick function that I'm not aware of?\n", "output": ">>> import ast\n>>> x = u'[ \"A\",\"B\",\"C\" , \" D\"]'\n>>> x = ast.literal_eval(x)\n>>> x\n['A', 'B', 'C', ' D']\n>>> x = [n.strip() for n in x]\n>>> x\n['A', 'B', 'C', 'D']\n\nast.literal_eval:\n\nSafely evaluate an expression node or a string containing a Python expression. 
The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.\n\n"} {"input": "Can't install PIL after Mac OS X 10.9 I've just updated my Mac OS to 10.9 and I discovered that some (all?) of my Python modules are not here anymore, especially the Image one.\nSo I try to execute sudo pip install pil, but I get this error:\n/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/usr/include/tk.h:78:11: fatal error: 'X11/Xlib.h' file not found\n\n# include \n\n ^\n\n1 error generated.\n\nerror: command 'cc' failed with exit status 1\n\nMy Xcode is up-to-date and I don't have any idea. Is it possible that PIL is not yet 10.9 compatible ?\n", "output": "Following worked for me:\nln -s /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11 /usr/local/include/X11\nsudo pip install pil\n\nUPDATE:\nBut there is more correct solution below, provided by Will.\n\nopen your terminal and execute:\n xcode-select --install\n\n"} {"input": "Writing Unicode text to a text file? I'm pulling data out of a Google doc, processing it, and writing it to a file (that eventually I will paste into a Wordpress page).\nIt has some non-ASCII symbols. How can I convert these safely to symbols that can be used in HTML source? 
\nCurrently I'm converting everything to Unicode on the way in, joining it all together in a Python string, then doing: \nimport codecs\nf = codecs.open('out.txt', mode=\"w\", encoding=\"iso-8859-1\")\nf.write(all_html.encode(\"iso-8859-1\", \"replace\"))\n\nThere is an encoding error on the last line: \n\nUnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position\n 12286: ordinal not in range(128)\n\nPartial solution:\nThis Python runs without an error:\nrow = [unicode(x.strip()) if x is not None else u'' for x in row]\nall_html = row[0] + \"
\" + row[1]\nf = open('out.txt', 'w')\nf.write(all_html.encode(\"utf-8\")\n\nBut then if I open the actual text file, I see lots of symbols like:\nQur\u00e2\u0080\u009a\u00c3\u0084\u00c3\u00b4an \n\nMaybe I need to write to something other than a text file? \n", "output": "Deal exclusively with unicode objects as much as possible by decoding things to unicode objects when you first get them and encoding them as necessary on the way out.\nIf your string is actually a unicode object, you'll need to convert it to a unicode-encoded string object before writing it to a file:\nfoo = u'\u00ce\u0094, \u00d0\u0099, \u00d7\u00a7, \u00e2\u0080\u008e \u00d9\u0085, \u00e0\u00b9\u0097, \u00e3\u0081\u0082, \u00e5\u008f\u00b6, \u00e8\u0091\u0089, and \u00eb\u00a7\u0090.'\nf = open('test', 'w')\nf.write(foo.encode('utf8'))\nf.close()\n\nWhen you read that file again, you'll get a unicode-encoded string that you can decode to a unicode object:\nf = file('test', 'r')\nprint f.read().decode('utf8')\n\n"} {"input": "How to install Python MySQLdb module using pip? How can I install the MySQLdb module for Python using pip?\n", "output": "It's easy to do, but hard to remember the correct spelling:\npip install MySQL-python\n\nNote: Some dependencies might have to be in place when running the above command. Some hints on how to install these on various platforms:\nUbuntu 14, Ubuntu 16, Debian 8.6 (jessie)\nsudo apt-get install python-pip python-dev libmysqlclient-dev\n\nFedora 24:\nsudo dnf install python python-devel mysql-devel redhat-rpm-config gcc\n\n"} {"input": "Sleeping in a batch file When writing a batch file to automate something on a Windows box, I've needed to pause its execution for several seconds (usually in a test/wait loop, waiting for a process to start). At the time, the best solution I could find uses ping (I kid you not) to achieve the desired effect. 
I've found a better write-up of it here, which describes a callable \"wait.bat\", implemented as follows:\n@ping 127.0.0.1 -n 2 -w 1000 > nul\n@ping 127.0.0.1 -n %1% -w 1000> nul\n\nYou can then include calls to wait.bat in your own batch file, passing in the number of seconds to sleep.\nApparently the Windows 2003 Resource Kit provides a Unix-like sleep command (at last!). In the meantime, for those of us still using Windows\u00a0XP, Windows 2000 or (sadly) Windows NT, is there a better way?\nI modified the sleep.py script in the accepted answer, so that it defaults to one second if no arguments are passed on the command line:\nimport time, sys\n\ntime.sleep(float(sys.argv[1]) if len(sys.argv) > 1 else 1)\n\n", "output": "The timeout command is available from Vista onwards.\nc:\\> timeout /?\n\nTIMEOUT [/T] timeout [/NOBREAK]\n\nDescription:\n This utility accepts a timeout parameter to wait for the specified\n time period (in seconds) or until any key is pressed. It also\n accepts a parameter to ignore the key press.\n\nParameter List:\n /T timeout Specifies the number of seconds to wait.\n Valid range is -1 to 99999 seconds.\n\n /NOBREAK Ignore key presses and wait specified time.\n\n /? Displays this help message.\n\nNOTE: A timeout value of -1 means to wait indefinitely for a key press.\n\nExamples:\n TIMEOUT /?\n TIMEOUT /T 10\n TIMEOUT /T 300 /NOBREAK\n TIMEOUT /T -1\n\nNote: Does not work with input redirection - trivial example:\nC:\\>echo 1 | timeout /t 1 /nobreak\nERROR: Input redirection is not supported, exiting the process immediately.\n\n"} {"input": "Why are Python's arrays slow? 
I expected array.array to be faster than lists, as arrays seem to be unboxed.\nHowever, I get the following result:\nIn [1]: import array\n\nIn [2]: L = list(range(100000000))\n\nIn [3]: A = array.array('l', range(100000000))\n\nIn [4]: %timeit sum(L)\n1 loop, best of 3: 667 ms per loop\n\nIn [5]: %timeit sum(A)\n1 loop, best of 3: 1.41 s per loop\n\nIn [6]: %timeit sum(L)\n1 loop, best of 3: 627 ms per loop\n\nIn [7]: %timeit sum(A)\n1 loop, best of 3: 1.39 s per loop\n\nWhat could be the cause of such a difference?\n", "output": "The storage is \"unboxed\", but every time you access an element Python has to \"box\" it (embed it in a regular Python object) in order to do anything with it. For example, your sum(A) iterates over the array, and boxes each integer, one at a time, in a regular Python int object. That costs time. In your sum(L), all the boxing was done at the time the list was created.\nSo, in the end, an array is generally slower, but requires substantially less memory.\n\nHere's the relevant code from a recent version of Python 3, but the same basic ideas apply to all CPython implementations since Python was first released.\nHere's the code to access a list item:\nPyObject *\nPyList_GetItem(PyObject *op, Py_ssize_t i)\n{\n /* error checking omitted */\n return ((PyListObject *)op) -> ob_item[i];\n}\n\nThere's very little to it: somelist[i] just returns the i'th object in the list (and all Python objects in CPython are pointers to a struct whose initial segment conforms to the layout of a struct PyObject).\nAnd here's the __getitem__ implementation for an array with type code l:\nstatic PyObject *\nl_getitem(arrayobject *ap, Py_ssize_t i)\n{\n return PyLong_FromLong(((long *)ap->ob_item)[i]);\n}\n\nThe raw memory is treated as a vector of platform-native C long integers; the i'th C long is read up; and then PyLong_FromLong() is called to wrap (\"box\") the native C long in a Python long object (which, in Python 3, which eliminates Python 2's 
distinction between int and long, is actually shown as type int).\nThis boxing has to allocate new memory for a Python int object, and spray the native C long's bits into it. In the context of the original example, this object's lifetime is very brief (just long enough for sum() to add the contents into a running total), and then more time is required to deallocate the new int object.\nThis is where the speed difference comes from, always has come from, and always will come from in the CPython implementation.\n"} {"input": "Binary search (bisection) in Python Is there a library function that performs binary search on a list/tuple and return the position of the item if found and 'False' (-1, None, etc.) if not?\nI found the functions bisect_left/right in the bisect module, but they still return a position even if the item is not in the list. That's perfectly fine for their intended usage, but I just want to know if an item is in the list or not (don't want to insert anything).\nI thought of using bisect_left and then checking if the item at that position is equal to what I'm searching, but that seems cumbersome (and I also need to do bounds checking if the number can be larger than the largest number in my list). If there is a nicer method I'd like to know about it.\nEdit To clarify what I need this for: I'm aware that a dictionary would be very well suited for this, but I'm trying to keep the memory consumption as low as possible. My intended usage would be a sort of double-way look-up table. I have in the table a list of values and I need to be able to access the values based on their index. And also I want to be able to find the index of a particular value or None if the value is not in the list.\nUsing a dictionary for this would be the fastest way, but would (approximately) double the memory requirements.\nI was asking this question thinking that I may have overlooked something in the Python libraries. 
It seems I'll have to write my own code, as Moe suggested.\n", "output": "from bisect import bisect_left\n\ndef binary_search(a, x, lo=0, hi=None): # can't use a to specify default for hi\n hi = hi if hi is not None else len(a) # hi defaults to len(a) \n pos = bisect_left(a,x,lo,hi) # find insertion position\n return (pos if pos != hi and a[pos] == x else -1) # don't walk off the end\n\n"} {"input": "Installing SetupTools on 64-bit Windows I'm running Python 2.7 on Windows 7 64-bit, and when I run the installer for setuptools it tells me that Python 2.7 is not installed. The specific error message is:\n`Python Version 2.7 required which was not found in the registry`\n\nMy installed version of Python is:\n`Python 2.7 (r27:82525, Jul 4 2010, 07:43:08) [MSC v.1500 64 bit (AMD64)] on win32`\n\nI'm looking at the setuptools site and it doesn't mention any installers for 64-bit Windows. Have I missed something or do I have to install this from source?\n", "output": "Problem: you have 64-bit Python, and a 32-bit installer. This will cause problems for extension modules. \nThe reason why the installer doesn't find Python is the transparent 32-bit emulation from Windows 7. 64-bit and 32-bit programs will write to different parts of the Windows registry.\n64-bit: HKLM|HKCU\\SOFTWARE\\\n32-bit: HKLM|HKCU\\SOFTWARE\\wow6432node\\.\nThis means that the 64-bit Python installer writes to HKLM\\SOFTWARE\\Python, but the 32-bit setuptools installer looks at HKLM\\SOFTWARE\\wow6432node\\Python (this is handled by Windows automatically, programs don't notice). 
This is expected behavior and not a bug.\nUsually, you have these choices:\n\nthe \"clean\" way: use 32-bit Python if you have to use 32-bit modules or extensions\nthe other \"clean\" way: only use 64-bit installers when using 64-bit Python (see below)\nwhat the answer above suggests: copy HKLM\\SOFTWARE\\Python to HKLM\\SOFTWARE\\wow6432node\\Python, but this will cause problems with binary distributions, as 64-bit Python can't load 32-bit compiled modules (do NOT do this!)\ninstall pure Python modules with setuptools instead of the distutils installer (easy_install or pip)\n\nFor setuptools itself, for example, you can't use a 32-bit installer for 64-bit Python as it includes binary files. But there's a 64-bit installer at http://www.lfd.uci.edu/~gohlke/pythonlibs/ (has many installers for other modules too). As a rule of thumb: install pure-python modules using distribute and pip (see here), and find a binary package matching your installed Python architecture for everything else.\n"} {"input": "How can I reference requirements.txt for the install_requires kwarg in setuptools' setup.py file? I have a requirements.txt file that I'm using with Travis-CI. It seems silly to duplicate the requirements in both requirements.txt and setup.py, so I was hoping to pass a file handle to the install_requires kwarg in setuptools.setup.\nIs this possible?\nIf so, how should I go about doing it?\nFor good measure, here is my requirements.txt file:\nguessit>=0.5.2\ntvdb_api>=1.8.2\nhachoir-metadata>=1.3.3\nhachoir-core>=1.3.3\nhachoir-parser>=1.3.4\n\n", "output": "A requirements file can contain comments (#) and can include some other files (--requirement or -r).\nThus, if you really want to parse a requirements.txt you should use the pip parser:\nfrom pip.req import parse_requirements\n\n# parse_requirements(\"requirements.txt\") returns a generator of pip.req.InstallRequirement objects\ninstall_reqs = parse_requirements(\"requirements.txt\")\n\n# reqs is a list of requirements\n# e.g. 
['django==1.5.1', 'mezzanine==1.4.6']\nreqs = [str(ir.req) for ir in install_reqs]\n\nsetup(\n ...\n install_requires=reqs\n)\n\nNOTE: my answer is old now. Pip does not have a public API so this no longer works (e.g. parse_requirements now requires an instance of pip.download.PipSession). You can do the opposite: list the dependencies in setup.py and have a single character -- a dot . -- in requirements.txt.\n"} {"input": "Catching an exception while using a Python 'with' statement To my shame, I can't figure out how to handle an exception for the Python 'with' statement. If I have this code:\nwith open(\"a.txt\") as f:\n print f.readlines()\n\nI really want to handle the 'file not found' exception in order to do something. But I can't write\nwith open(\"a.txt\") as f:\n print f.readlines()\nexcept:\n print 'oops'\n\nand can't write\nwith open(\"a.txt\") as f:\n print f.readlines()\nelse:\n print 'oops'\n\nenclosing 'with' in a try/except statement doesn't work else: exception is not raised. What can I do in order to process failure inside 'with' statement in a Pythonic way?\n", "output": "from __future__ import with_statement\n\ntry:\n with open( \"a.txt\" ) as f :\n print f.readlines()\nexcept EnvironmentError: # parent of IOError, OSError *and* WindowsError where available\n print 'oops'\n\nIf you want different handling for errors from the open call vs the working code you could do:\ntry:\n f = open('foo.txt')\nexcept IOError:\n print('error')\nelse:\n with f:\n print f.readlines()\n\n"} {"input": "Building a minimal plugin architecture in Python I have an application, written in Python, which is used by a fairly technical audience (scientists). \nI'm looking for a good way to make the application extensible by the users, i.e. a scripting/plugin architecture. \nI am looking for something extremely lightweight. 
Most scripts, or plugins, are not going to be developed and distributed by a third-party and installed, but are going to be something whipped up by a user in a few minutes to automate a repeating task, add support for a file format, etc. So plugins should have the absolute minimum boilerplate code, and require no 'installation' other than copying to a folder (so something like setuptools entry points, or the Zope plugin architecture seems like too much.)\nAre there any systems like this already out there, or any projects that implement a similar scheme that I should look at for ideas / inspiration?\n", "output": "Mine is, basically, a directory called \"plugins\" which the main app can poll and then use imp.load_module to pick up files, look for a well-known entry point possibly with module-level config params, and go from there. I use file-monitoring stuff for a certain amount of dynamism in which plugins are active, but that's a nice-to-have.\nOf course, any requirement that comes along saying \"I don't need [big, complicated thing] X; I just want something lightweight\" runs the risk of re-implementing X one discovered requirement at a time. But that's not to say you can't have some fun doing it anyway :)\n"} {"input": "Django: Redirect to previous page after login I'm trying to build a simple website with login functionality very similar to the one here on SO.\nThe user should be able to browse the site as an anonymous user and there will be a login link on every page. When clicking on the login link the user will be taken to the login form. After a successful login the user should be taken back to the page from where he clicked the login link in the first place.\nI'm guessing that I have to somehow pass the url of the current page to the view that handles the login form but I can't really get it to work.\nEDIT:\nI figured it out. I linked to the login form by passing the current page as a GET parameter and then used 'next' to redirect to that page. 
Thanks!\nEDIT 2:\nMy explanation did not seem to be clear, so as requested here is my code:\nLet's say we are on a page foo.html and we are not logged in. Now we would like to have a link on foo.html that links to login.html. There we can log in and are then redirected back to foo.html.\nThe link on foo.html looks like this:\n<a href='/login/?next={{ request.path }}'>Login</a>\n\nNow I wrote a custom login view that looks somewhat like this:\ndef login_view(request):\n redirect_to = request.REQUEST.get('next', '')\n if request.method=='POST':\n #create login form...\n if valid login credentials have been entered:\n return HttpResponseRedirect(redirect_to) \n #...\n return render_to_response('login.html', locals())\n\nAnd the important line in login.html:\n
<input type=\"hidden\" name=\"next\" value=\"{{ redirect_to }}\" />\n\nSo yeah, that's pretty much it, hope that makes it clear.\n", "output": "You do not need to make an extra view for this, the functionality is already built in.\nFirst each page with a login link needs to know the current path, and the easiest way is to add the request context processor to settings.py (the first 4 are default), then the request object will be available in each request:\nsettings.py:\nTEMPLATE_CONTEXT_PROCESSORS = (\n \"django.core.context_processors.auth\",\n \"django.core.context_processors.debug\",\n \"django.core.context_processors.i18n\",\n \"django.core.context_processors.media\",\n \"django.core.context_processors.request\",\n)\n\nThen add in the template you want the Login link:\nbase.html:\n<a href=\"{% url django.contrib.auth.views.login %}?next={{request.path}}\">Login</a>\n\nThis will add a GET argument to the login page that points back to the current page.\nThe login template can then be as simple as this:\nregistration/login.html:\n{% block content %}\n<form method=\"post\" action=\"\">\n {{form.as_p}}\n<input type=\"submit\" value=\"Login\">\n</form>
\n{% endblock %}\n\n"} {"input": "Is False == 0 and True == 1 in Python an implementation detail or is it guaranteed by the language? Is it guaranteed that False == 0 and True == 1, in Python? For instance, is it in any way guaranteed that the following code will always produce the same results, whatever the version of Python (both existing and, likely, future ones)?\n0 == False # True\n1 == True # True\n['zero', 'one'][False] # is 'zero'\n\nAny reference to the official documentation would be much appreciated!\nEdit: As noted in many answers, bool inherits from int. The question can therefore be recast as: \"Does the documentation officially say that programmers can rely on booleans inheriting from integers, with the values 0 and 1?\". This question is relevant for writing robust code that won't fail because of implementation details!\n", "output": "In Python 2.x this is not guaranteed as it is possible for True and False to be reassigned. However, even if this happens, boolean True and boolean False are still properly returned for comparisons.\nIn Python 3.x True and False are keywords and will always be equal to 1 and 0.\nUnder normal circumstances in Python 2, and always in Python 3:\nFalse object is of type bool which is a subclass of int:\nobject\n |\n int\n |\n bool\n\nIt is the only reason why in your example, ['zero', 'one'][False] does work. It would not work with an object which is not a subclass of integer, because list indexing only works with integers, or objects that define a __index__ method (thanks mark-dickinson).\nEdit:\nIt is true of the current python version, and of that of Python 3. The docs for python 2.6 and the docs for Python 3 both say:\n\nThere are two types of integers: [...] Integers (int) [...] Booleans (bool)\n\nand in the boolean subsection:\n\nBooleans: These represent the truth values False and True [...] 
Boolean values behave like the values 0 and 1, respectively, in almost all contexts, the exception being that when converted to a string, the strings \"False\" or \"True\" are returned, respectively.\n\nSo booleans are explicitly considered as integers in Python 2.6 and 3.\nSo you're safe until Python 4 comes along. ;-)\n"} {"input": "Decode HTML entities in Python string? I'm parsing some HTML with Beautiful Soup 3, but it contains HTML entities which Beautiful Soup 3 doesn't automatically decode for me:\n>>> from BeautifulSoup import BeautifulSoup\n\n>>> soup = BeautifulSoup(\"<p>&pound;682m</p>\")\n>>> text = soup.find(\"p\").string\n\n>>> print text\n&pound;682m\n\nHow can I decode the HTML entities in text to get \"\u00a3682m\" instead of \"&pound;682m\"?\n", "output": "Python 3.4+\nHTMLParser.unescape is deprecated, and was supposed to be removed in 3.5, although it was left in by mistake. It will be removed from the language soon. Instead, use html.unescape():\nimport html\nprint(html.unescape('&pound;682m'))\n\nsee https://docs.python.org/3/library/html.html#html.unescape\n\nPython 2.6-3.3\nYou can use the HTML parser from the standard library:\n>>> try:\n... # Python 2.6-2.7 \n... from HTMLParser import HTMLParser\n... except ImportError:\n... # Python 3\n... from html.parser import HTMLParser\n... \n>>> h = HTMLParser()\n>>> print(h.unescape('&pound;682m'))\n\u00a3682m\n\nSee http://docs.python.org/2/library/htmlparser.html\nYou can also use the six compatibility library to simplify the import:\n>>> from six.moves.html_parser import HTMLParser\n>>> h = HTMLParser()\n>>> print(h.unescape('&pound;682m'))\n\u00a3682m\n\n"} {"input": "Why is it slower to iterate over a small string than a small list? I was playing around with timeit and noticed that doing a simple list comprehension over a small string took longer than doing the same operation on a list of small single character strings. Any explanation? It's almost 1.35 times as much time.\n>>> from timeit import timeit\n>>> timeit(\"[x for x in 'abc']\")\n2.0691067844831528\n>>> timeit(\"[x for x in ['a', 'b', 'c']]\")\n1.5286479570345861\n\nWhat's happening on a lower level that's causing this?\n", "output": "TL;DR\n\nThe actual speed difference is closer to 70% (or more) once a lot of the overhead is removed, for Python 2.\nObject creation is not at fault. Neither method creates a new object, as one-character strings are cached.\nThe difference is unobvious, but is likely created from a greater number of checks on string indexing, with regards to the type and well-formedness. 
It is also quite likely thanks to the need to check what to return.\nList indexing is remarkably fast.\n\n\n\n>>> python3 -m timeit '[x for x in \"abc\"]'\n1000000 loops, best of 3: 0.388 usec per loop\n\n>>> python3 -m timeit '[x for x in [\"a\", \"b\", \"c\"]]'\n1000000 loops, best of 3: 0.436 usec per loop\n\nThis disagrees with what you've found...\nYou must be using Python 2, then.\n>>> python2 -m timeit '[x for x in \"abc\"]'\n1000000 loops, best of 3: 0.309 usec per loop\n\n>>> python2 -m timeit '[x for x in [\"a\", \"b\", \"c\"]]'\n1000000 loops, best of 3: 0.212 usec per loop\n\nLet's explain the difference between the versions. I'll examine the compiled code.\nFor Python 3:\nimport dis\n\ndef list_iterate():\n [item for item in [\"a\", \"b\", \"c\"]]\n\ndis.dis(list_iterate)\n#>>> 4 0 LOAD_CONST 1 ( at 0x7f4d06b118a0, file \"\", line 4>)\n#>>> 3 LOAD_CONST 2 ('list_iterate..')\n#>>> 6 MAKE_FUNCTION 0\n#>>> 9 LOAD_CONST 3 ('a')\n#>>> 12 LOAD_CONST 4 ('b')\n#>>> 15 LOAD_CONST 5 ('c')\n#>>> 18 BUILD_LIST 3\n#>>> 21 GET_ITER\n#>>> 22 CALL_FUNCTION 1 (1 positional, 0 keyword pair)\n#>>> 25 POP_TOP\n#>>> 26 LOAD_CONST 0 (None)\n#>>> 29 RETURN_VALUE\n\ndef string_iterate():\n [item for item in \"abc\"]\n\ndis.dis(string_iterate)\n#>>> 21 0 LOAD_CONST 1 ( at 0x7f4d06b17150, file \"\", line 21>)\n#>>> 3 LOAD_CONST 2 ('string_iterate..')\n#>>> 6 MAKE_FUNCTION 0\n#>>> 9 LOAD_CONST 3 ('abc')\n#>>> 12 GET_ITER\n#>>> 13 CALL_FUNCTION 1 (1 positional, 0 keyword pair)\n#>>> 16 POP_TOP\n#>>> 17 LOAD_CONST 0 (None)\n#>>> 20 RETURN_VALUE\n\nYou see here that the list variant is likely to be slower due to the building of the list each time.\nThis is the\n 9 LOAD_CONST 3 ('a')\n12 LOAD_CONST 4 ('b')\n15 LOAD_CONST 5 ('c')\n18 BUILD_LIST 3\n\npart. 
The string variant only has\n 9 LOAD_CONST 3 ('abc')\n\nYou can check that this does seem to make a difference:\ndef string_iterate():\n [item for item in (\"a\", \"b\", \"c\")]\n\ndis.dis(string_iterate)\n#>>> 35 0 LOAD_CONST 1 ( at 0x7f4d068be660, file \"\", line 35>)\n#>>> 3 LOAD_CONST 2 ('string_iterate..')\n#>>> 6 MAKE_FUNCTION 0\n#>>> 9 LOAD_CONST 6 (('a', 'b', 'c'))\n#>>> 12 GET_ITER\n#>>> 13 CALL_FUNCTION 1 (1 positional, 0 keyword pair)\n#>>> 16 POP_TOP\n#>>> 17 LOAD_CONST 0 (None)\n#>>> 20 RETURN_VALUE\n\nThis produces just\n 9 LOAD_CONST 6 (('a', 'b', 'c'))\n\nas tuples are immutable. Test:\n>>> python3 -m timeit '[x for x in (\"a\", \"b\", \"c\")]'\n1000000 loops, best of 3: 0.369 usec per loop\n\nGreat, back up to speed.\nFor Python 2:\ndef list_iterate():\n [item for item in [\"a\", \"b\", \"c\"]]\n\ndis.dis(list_iterate)\n#>>> 2 0 BUILD_LIST 0\n#>>> 3 LOAD_CONST 1 ('a')\n#>>> 6 LOAD_CONST 2 ('b')\n#>>> 9 LOAD_CONST 3 ('c')\n#>>> 12 BUILD_LIST 3\n#>>> 15 GET_ITER \n#>>> >> 16 FOR_ITER 12 (to 31)\n#>>> 19 STORE_FAST 0 (item)\n#>>> 22 LOAD_FAST 0 (item)\n#>>> 25 LIST_APPEND 2\n#>>> 28 JUMP_ABSOLUTE 16\n#>>> >> 31 POP_TOP \n#>>> 32 LOAD_CONST 0 (None)\n#>>> 35 RETURN_VALUE \n\ndef string_iterate():\n [item for item in \"abc\"]\n\ndis.dis(string_iterate)\n#>>> 2 0 BUILD_LIST 0\n#>>> 3 LOAD_CONST 1 ('abc')\n#>>> 6 GET_ITER \n#>>> >> 7 FOR_ITER 12 (to 22)\n#>>> 10 STORE_FAST 0 (item)\n#>>> 13 LOAD_FAST 0 (item)\n#>>> 16 LIST_APPEND 2\n#>>> 19 JUMP_ABSOLUTE 7\n#>>> >> 22 POP_TOP \n#>>> 23 LOAD_CONST 0 (None)\n#>>> 26 RETURN_VALUE \n\nThe odd thing is that we have the same building of the list, but it's still faster for this. Python 2 is acting strangely fast.\nLet's remove the comprehensions and re-time. 
The _ = is to prevent it getting optimised out.\n>>> python3 -m timeit '_ = [\"a\", \"b\", \"c\"]'\n10000000 loops, best of 3: 0.0707 usec per loop\n\n>>> python3 -m timeit '_ = \"abc\"'\n100000000 loops, best of 3: 0.0171 usec per loop\n\nWe can see that initialization is not significant enough to account for the difference between the versions (those numbers are small)! We can thus conclude that Python 3 has slower comprehensions. This makes sense as Python 3 changed comprehensions to have safer scoping.\nWell, now improve the benchmark (I'm just removing overhead that isn't iteration). This removes the building of the iterable by pre-assigning it:\n>>> python3 -m timeit -s 'iterable = \"abc\"' '[x for x in iterable]'\n1000000 loops, best of 3: 0.387 usec per loop\n\n>>> python3 -m timeit -s 'iterable = [\"a\", \"b\", \"c\"]' '[x for x in iterable]'\n1000000 loops, best of 3: 0.368 usec per loop\n\n>>> python2 -m timeit -s 'iterable = \"abc\"' '[x for x in iterable]'\n1000000 loops, best of 3: 0.309 usec per loop\n\n>>> python2 -m timeit -s 'iterable = [\"a\", \"b\", \"c\"]' '[x for x in iterable]'\n10000000 loops, best of 3: 0.164 usec per loop\n\nWe can check if calling iter is the overhead:\n>>> python3 -m timeit -s 'iterable = \"abc\"' 'iter(iterable)'\n10000000 loops, best of 3: 0.099 usec per loop\n\n>>> python3 -m timeit -s 'iterable = [\"a\", \"b\", \"c\"]' 'iter(iterable)'\n10000000 loops, best of 3: 0.1 usec per loop\n\n>>> python2 -m timeit -s 'iterable = \"abc\"' 'iter(iterable)'\n10000000 loops, best of 3: 0.0913 usec per loop\n\n>>> python2 -m timeit -s 'iterable = [\"a\", \"b\", \"c\"]' 'iter(iterable)'\n10000000 loops, best of 3: 0.0854 usec per loop\n\nNo. No it is not. The difference is too small, especially for Python 3.\nSo let's remove yet more unwanted overhead... by making the whole thing slower! 
The aim is just to have a longer iteration so the time hides overhead.\n>>> python3 -m timeit -s 'import random; iterable = \"\".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'\n100 loops, best of 3: 3.12 msec per loop\n\n>>> python3 -m timeit -s 'import random; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'\n100 loops, best of 3: 2.77 msec per loop\n\n>>> python2 -m timeit -s 'import random; iterable = \"\".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'\n100 loops, best of 3: 2.32 msec per loop\n\n>>> python2 -m timeit -s 'import random; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'\n100 loops, best of 3: 2.09 msec per loop\n\nThis hasn't actually changed much, but it's helped a little.\nSo remove the comprehension. It's overhead that's not part of the question:\n>>> python3 -m timeit -s 'import random; iterable = \"\".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'\n1000 loops, best of 3: 1.71 msec per loop\n\n>>> python3 -m timeit -s 'import random; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'\n1000 loops, best of 3: 1.36 msec per loop\n\n>>> python2 -m timeit -s 'import random; iterable = \"\".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'\n1000 loops, best of 3: 1.27 msec per loop\n\n>>> python2 -m timeit -s 'import random; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'\n1000 loops, best of 3: 935 usec per loop\n\nThat's more like it! We can get slightly faster still by using deque to iterate. 
It's basically the same, but it's faster:\n>>> python3 -m timeit -s 'import random; from collections import deque; iterable = \"\".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 777 usec per loop\n\n>>> python3 -m timeit -s 'import random; from collections import deque; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 405 usec per loop\n\n>>> python2 -m timeit -s 'import random; from collections import deque; iterable = \"\".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 805 usec per loop\n\n>>> python2 -m timeit -s 'import random; from collections import deque; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 438 usec per loop\n\nWhat impresses me is that Unicode is competitive with bytestrings. We can check this explicitly by trying bytes and unicode in both:\n\nbytes\n>>> python3 -m timeit -s 'import random; from collections import deque; iterable = b\"\".join(chr(random.randint(0, 127)).encode(\"ascii\") for _ in range(100000))' 'deque(iterable, maxlen=0)' :(\n1000 loops, best of 3: 571 usec per loop\n\n>>> python3 -m timeit -s 'import random; from collections import deque; iterable = [chr(random.randint(0, 127)).encode(\"ascii\") for _ in range(100000)]' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 394 usec per loop\n\n>>> python2 -m timeit -s 'import random; from collections import deque; iterable = b\"\".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 757 usec per loop\n\n>>> python2 -m timeit -s 'import random; from collections import deque; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 438 usec per loop\n\nHere you see Python 3 actually faster than Python 
2.\nunicode\n>>> python3 -m timeit -s 'import random; from collections import deque; iterable = u\"\".join( chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 800 usec per loop\n\n>>> python3 -m timeit -s 'import random; from collections import deque; iterable = [ chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 394 usec per loop\n\n>>> python2 -m timeit -s 'import random; from collections import deque; iterable = u\"\".join(unichr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 1.07 msec per loop\n\n>>> python2 -m timeit -s 'import random; from collections import deque; iterable = [unichr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 469 usec per loop\n\nAgain, Python 3 is faster, although this is to be expected (str has had a lot of attention in Python 3).\n\nIn fact, this unicode-bytes difference is very small, which is impressive.\nSo let's analyse this one case, seeing as it's fast and convenient for me:\n>>> python3 -m timeit -s 'import random; from collections import deque; iterable = \"\".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 777 usec per loop\n\n>>> python3 -m timeit -s 'import random; from collections import deque; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'\n1000 loops, best of 3: 405 usec per loop\n\nWe can actually rule out Tim Peter's 10-times-upvoted answer!\n>>> foo = iterable[123]\n>>> iterable[36] is foo\nTrue\n\nThese are not new objects!\nBut this is worth mentioning: indexing costs. 
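You can see that caching from pure Python as well. A minimal check (this relies on CPython caching one-character strings with code points below 256, an implementation detail rather than a language guarantee):

```python
# CPython keeps a cache of single-character strings for code points < 256,
# so indexing a str hands back the same cached object every time instead of
# allocating a new one-character string.
s = "abcabc"
first_a = s[0]
second_a = s[3]
print(first_a is second_a)  # True on CPython: both are the cached 'a'
```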
The difference will likely be in the indexing, so remove the iteration and just index:\n>>> python3 -m timeit -s 'import random; iterable = \"\".join(chr(random.randint(0, 127)) for _ in range(100000))' 'iterable[123]'\n10000000 loops, best of 3: 0.0397 usec per loop\n\n>>> python3 -m timeit -s 'import random; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable[123]'\n10000000 loops, best of 3: 0.0374 usec per loop\n\nThe difference seems small, but at least half of the cost is overhead:\n>>> python3 -m timeit -s 'import random; iterable = [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable; 123'\n100000000 loops, best of 3: 0.0173 usec per loop\n\nso the speed difference is sufficient to decide to blame it. I think.\nSo why is indexing a list so much faster?\nWell, I'll come back to you on that, but my guess is that it's down to the check for interned strings (or cached characters if it's a separate mechanism). This will be less fast than optimal. But I'll go check the source (although I'm not comfortable in C...) :).\n\nSo here's the source:\nstatic PyObject *\nunicode_getitem(PyObject *self, Py_ssize_t index)\n{\n void *data;\n enum PyUnicode_Kind kind;\n Py_UCS4 ch;\n PyObject *res;\n\n if (!PyUnicode_Check(self) || PyUnicode_READY(self) == -1) {\n PyErr_BadArgument();\n return NULL;\n }\n if (index < 0 || index >= PyUnicode_GET_LENGTH(self)) {\n PyErr_SetString(PyExc_IndexError, \"string index out of range\");\n return NULL;\n }\n kind = PyUnicode_KIND(self);\n data = PyUnicode_DATA(self);\n ch = PyUnicode_READ(kind, data, index);\n if (ch < 256)\n return get_latin1_char(ch);\n\n res = PyUnicode_New(1, ch);\n if (res == NULL)\n return NULL;\n kind = PyUnicode_KIND(res);\n data = PyUnicode_DATA(res);\n PyUnicode_WRITE(kind, data, 0, ch);\n assert(_PyUnicode_CheckConsistency(res, 1));\n return res;\n}\n\nWalking from the top, we'll have some checks. These are boring. Then some assigns, which should also be boring. 
The first interesting line is\nch = PyUnicode_READ(kind, data, index);\n\nbut we'd hope that is fast, as we're reading from a contiguous C array by indexing it. The result, ch, will be less than 256 so we'll return the cached character in get_latin1_char(ch).\nSo we'll run (dropping the first checks)\nkind = PyUnicode_KIND(self);\ndata = PyUnicode_DATA(self);\nch = PyUnicode_READ(kind, data, index);\nreturn get_latin1_char(ch);\n\nWhere\n#define PyUnicode_KIND(op) \\\n (assert(PyUnicode_Check(op)), \\\n assert(PyUnicode_IS_READY(op)), \\\n ((PyASCIIObject *)(op))->state.kind)\n\n(which is boring because asserts get ignored in debug [so I can check that they're fast] and ((PyASCIIObject *)(op))->state.kind) is (I think) an indirection and a C-level cast);\n#define PyUnicode_DATA(op) \\\n (assert(PyUnicode_Check(op)), \\\n PyUnicode_IS_COMPACT(op) ? _PyUnicode_COMPACT_DATA(op) : \\\n _PyUnicode_NONCOMPACT_DATA(op))\n\n(which is also boring for similar reasons, assuming the macros (Something_CAPITALIZED) are all fast),\n#define PyUnicode_READ(kind, data, index) \\\n ((Py_UCS4) \\\n ((kind) == PyUnicode_1BYTE_KIND ? \\\n ((const Py_UCS1 *)(data))[(index)] : \\\n ((kind) == PyUnicode_2BYTE_KIND ? \\\n ((const Py_UCS2 *)(data))[(index)] : \\\n ((const Py_UCS4 *)(data))[(index)] \\\n ) \\\n ))\n\n(which involves indexes but really isn't slow at all) and\nstatic PyObject*\nget_latin1_char(unsigned char ch)\n{\n PyObject *unicode = unicode_latin1[ch];\n if (!unicode) {\n unicode = PyUnicode_New(1, ch);\n if (!unicode)\n return NULL;\n PyUnicode_1BYTE_DATA(unicode)[0] = ch;\n assert(_PyUnicode_CheckConsistency(unicode, 1));\n unicode_latin1[ch] = unicode;\n }\n Py_INCREF(unicode);\n return unicode;\n}\n\nWhich confirms my suspicion that:\n\nThis is cached:\nPyObject *unicode = unicode_latin1[ch];\n\nThis should be fast. 
The if (!unicode) is not run, so it's literally equivalent in this case to\nPyObject *unicode = unicode_latin1[ch];\nPy_INCREF(unicode);\nreturn unicode;\n\n\nHonestly, after testing the asserts are fast (by disabling them [I think it works on the C-level asserts...]), the only plausibly-slow parts are:\nPyUnicode_IS_COMPACT(op)\n_PyUnicode_COMPACT_DATA(op)\n_PyUnicode_NONCOMPACT_DATA(op)\n\nWhich are:\n#define PyUnicode_IS_COMPACT(op) \\\n (((PyASCIIObject*)(op))->state.compact)\n\n(fast, as before),\n#define _PyUnicode_COMPACT_DATA(op) \\\n (PyUnicode_IS_ASCII(op) ? \\\n ((void*)((PyASCIIObject*)(op) + 1)) : \\\n ((void*)((PyCompactUnicodeObject*)(op) + 1)))\n\n(fast if the macro IS_ASCII is fast), and\n#define _PyUnicode_NONCOMPACT_DATA(op) \\\n (assert(((PyUnicodeObject*)(op))->data.any), \\\n ((((PyUnicodeObject *)(op))->data.any)))\n\n(also fast as it's an assert plus an indirection plus a cast).\nSo we're down (the rabbit hole) to:\nPyUnicode_IS_ASCII\n\nwhich is\n#define PyUnicode_IS_ASCII(op) \\\n (assert(PyUnicode_Check(op)), \\\n assert(PyUnicode_IS_READY(op)), \\\n ((PyASCIIObject*)op)->state.ascii)\n\nHmm... that seems fast too...\n\nWell, OK, but let's compare it to PyList_GetItem. (Yeah, thanks Tim Peters for giving me more work to do :P.)\nPyObject *\nPyList_GetItem(PyObject *op, Py_ssize_t i)\n{\n if (!PyList_Check(op)) {\n PyErr_BadInternalCall();\n return NULL;\n }\n if (i < 0 || i >= Py_SIZE(op)) {\n if (indexerr == NULL) {\n indexerr = PyUnicode_FromString(\n \"list index out of range\");\n if (indexerr == NULL)\n return NULL;\n }\n PyErr_SetObject(PyExc_IndexError, indexerr);\n return NULL;\n }\n return ((PyListObject *)op) -> ob_item[i];\n}\n\nWe can see that on non-error cases this is just going to run:\nPyList_Check(op)\nPy_SIZE(op)\n((PyListObject *)op) -> ob_item[i]\n\nWhere PyList_Check is\n#define PyList_Check(op) \\\n PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LIST_SUBCLASS)\n\n(TABS! TABS!!!) 
(issue21587) That got fixed and merged in 5 minutes. Like... yeah. Damn. They put Skeet to shame.\n#define Py_SIZE(ob) (((PyVarObject*)(ob))->ob_size)\n\n#define PyType_FastSubclass(t,f) PyType_HasFeature(t,f)\n\n#ifdef Py_LIMITED_API\n#define PyType_HasFeature(t,f) ((PyType_GetFlags(t) & (f)) != 0)\n#else\n#define PyType_HasFeature(t,f) (((t)->tp_flags & (f)) != 0)\n#endif\n\nSo this is normally really trivial (two indirections and a couple of boolean checks) unless Py_LIMITED_API is on, in which case... ???\nThen there's the indexing and a cast (((PyListObject *)op) -> ob_item[i]) and we're done.\nSo there are definitely fewer checks for lists, and the small speed differences certainly imply that it could be relevant.\n\nI think in general, there's just more type-checking and indirection (->) for Unicode. It seems I'm missing a point, but what?\n"} {"input": "Django 1.7 throws django.core.exceptions.AppRegistryNotReady: Models aren't loaded yet This is the traceback on my windows system. 
\nTraceback (most recent call last):\n File \"D:\\AMD\\workspace\\steelrumors\\manage.py\", line 9, in \n django.setup()\n File \"D:\\AMD\\Django\\django-django-4c85a0d\\django\\__init__.py\", line 21, in setup\n apps.populate(settings.INSTALLED_APPS)\n File \"D:\\AMD\\Django\\django-django-4c85a0d\\django\\apps\\registry.py\", line 108, in populate\n app_config.import_models(all_models)\n File \"D:\\AMD\\Django\\django-django-4c85a0d\\django\\apps\\config.py\", line 197, in import_models\n self.models_module = import_module(models_module_name)\n File \"C:\\Python27\\lib\\importlib\\__init__.py\", line 37, in import_module\n __import__(name)\n File \"C:\\Python27\\lib\\site-packages\\registration\\models.py\", line 15, in \n User = get_user_model()\n File \"D:\\AMD\\Django\\django-django-4c85a0d\\django\\contrib\\auth\\__init__.py\", line 135, in get_user_model\n return django_apps.get_model(settings.AUTH_USER_MODEL)\n File \"D:\\AMD\\Django\\django-django-4c85a0d\\django\\apps\\registry.py\", line 199, in get_model\n self.check_models_ready()\n File \"D:\\AMD\\Django\\django-django-4c85a0d\\django\\apps\\registry.py\", line 131, in check_models_ready\n raise AppRegistryNotReady(\"Models aren't loaded yet.\")\ndjango.core.exceptions.AppRegistryNotReady: Models aren't loaded yet.\n\nAnd my manage.py looks like this:\nimport os\nimport sys\nimport django\n\nif __name__ == \"__main__\":\n\n os.environ.setdefault(\"DJANGO_SETTINGS_MODULE\", \"steelrumors.settings\")\n django.setup()\n from django.core.management import execute_from_command_line\n\n execute_from_command_line(sys.argv)\n\nI get this error when I am trying to use the registration app in Django 1.7\n", "output": "Running these commands solved my problem (credit to this answer):\nimport django\ndjango.setup()\n\nHowever, I'm not sure why I need this. Comments would be appreciated.\n"} {"input": "Escaping regex string in Python I want to use input from a user as a regex pattern for a search over some text.
It works, but how can I handle cases where the user enters characters that have a special meaning in regex? For example, the user wants to search for Word (s): the regex engine will take the (s) as a group. I want it to treat it like the string \"(s)\". I can run replace on the user input and replace the ( with \\( and the ) with \\), but the problem is that I would need to do a replace for every possible regex symbol. Do you know a better way?\n", "output": "Use the re.escape() function for this:\n4.2.3 re Module Contents\n\nescape(string)\nReturn string with all non-alphanumerics backslashed; this is useful if you want to match an arbitrary literal string that may have regular expression metacharacters in it.\n\nA simplistic example: search for any occurrence of the provided string optionally followed by 's', and return the match object.\nimport re\n\ndef simplistic_plural(word, text):\n word_or_plural = re.escape(word) + 's?'\n return re.match(word_or_plural, text)\n\n"} {"input": "Python subprocess/Popen with a modified environment I believe that running an external command with a slightly modified environment is a very common case. That's how I tend to do it:\nimport subprocess, os\nmy_env = os.environ\nmy_env[\"PATH\"] = \"/usr/sbin:/sbin:\" + my_env[\"PATH\"]\nsubprocess.Popen(my_command, env=my_env)\n\nI've got a gut feeling that there's a better way; does it look alright?\n", "output": "I think os.environ.copy() is better if you don't intend to modify os.environ for the current process:\nimport subprocess, os\nmy_env = os.environ.copy()\nmy_env[\"PATH\"] = \"/usr/sbin:/sbin:\" + my_env[\"PATH\"]\nsubprocess.Popen(my_command, env=my_env)\n\n"} {"input": "Add Variables to Tuple I am learning Python and creating a database connection.\nWhile trying to add to the DB, I am thinking of creating tuples out of the information and then adding them to the DB.\nWhat I am doing:\nI am taking information from the user and storing it in variables.\nCan I add these variables into a tuple?
Can you please help me with the syntax?\nAlso, if there is an efficient way of doing this, please share...\nEDIT\nLet me edit this question a bit... I only need the tuple to enter info into the DB. Once the information is added to the DB, should I delete the tuple? I mean, I don't need the tuple anymore.\n", "output": "Tuples are immutable; you can't change which variables they contain after construction. However, you can concatenate or slice them to form new tuples:\na = (1, 2, 3)\nb = a + (4, 5, 6)\nc = b[1:]\n\nAnd, of course, build them from existing values:\nname = \"Joe\"\nage = 40\nlocation = \"New York\"\njoe = (name, age, location)\n\n"} {"input": "What is the fastest way to send 100,000 HTTP requests in Python? I am opening a file which has 100,000 URLs. I need to send an HTTP request to each URL and print the status code. I am using Python 2.6, and so far I have looked at the many confusing ways Python implements threading/concurrency. I have even looked at the python concurrence library, but cannot figure out how to write this program correctly. Has anyone come across a similar problem?
I guess generally I need to know how to perform thousands of tasks in Python as fast as possible - I suppose that means 'concurrently'.\nThank you,\nIgor\n", "output": "Twistedless solution:\nfrom urlparse import urlparse\nfrom threading import Thread\nimport httplib, sys\nfrom Queue import Queue\n\nconcurrent = 200\n\ndef doWork():\n while True:\n url = q.get()\n status, url = getStatus(url)\n doSomethingWithResult(status, url)\n q.task_done()\n\ndef getStatus(ourl):\n try:\n url = urlparse(ourl)\n conn = httplib.HTTPConnection(url.netloc)\n conn.request(\"HEAD\", url.path)\n res = conn.getresponse()\n return res.status, ourl\n except:\n return \"error\", ourl\n\ndef doSomethingWithResult(status, url):\n print status, url\n\nq = Queue(concurrent * 2)\nfor i in range(concurrent):\n t = Thread(target=doWork)\n t.daemon = True\n t.start()\ntry:\n for url in open('urllist.txt'):\n q.put(url.strip())\n q.join()\nexcept KeyboardInterrupt:\n sys.exit(1)\n\nThis one is slightly faster than the Twisted solution and uses less CPU.\n"}
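The worker-pool answer in the last record above translates directly to Python 3 (`Queue` became `queue`, `httplib` became `http.client`). Here is a minimal sketch of the same bounded-queue pattern; to keep it runnable without a `urllist.txt` or network access, the HEAD request is replaced by a stand-in squaring task. The squaring task, the `results` list, and the pool size are illustrative assumptions, not part of the original answer.

```python
from queue import Queue
from threading import Thread

concurrent = 8          # worker count; the answer above uses 200 for real network I/O
results = []            # list.append is atomic under the GIL, so plain append is safe here

def do_work(q, results):
    # Each daemon worker loops forever: take a task, process it, mark it done.
    while True:
        item = q.get()
        results.append(item * item)  # stand-in for getStatus(url)
        q.task_done()

q = Queue(concurrent * 2)  # bounded, so the feeder can't race far ahead of the workers
for _ in range(concurrent):
    Thread(target=do_work, args=(q, results), daemon=True).start()

for item in range(100):    # stand-in for the lines of urllist.txt
    q.put(item)
q.join()                   # blocks until task_done() has been called for every put()

print(len(results))        # 100
```

The bounded `Queue(concurrent * 2)` is the same design choice as in the original answer: it applies backpressure so the feeding loop cannot load 100,000 items into memory faster than the workers drain them.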