Barry Warsaw is the release manager for Python 2.6.9, so we looked at various release-blocking issues for 2.6.9. The bulk of them at this time are various DDoS possibilities where
.readline() is called with no size limitation. An attacker can then feed large amounts of data and consuming an arbitrary amount of memory. The fixes all take the same general pattern: add a size limitation to the
.readline() call that varies depending on the protocol, and then report an error if the line hits the limit.
I also committed the
traceback.clear_frames() function that I was working on last week.
- Describe the errors="surrogateescape" encoding parameter.
- Fixed a memory leak in curses.panel, though Serhiy Storchaka noted that the resulting bugfix can be used to trigger a segfault. I posted a possible fix for review.
I should probably just commit my edits to the Unicode howto at this point, and then move on to updating the regex howto.
Outline --------- Written Sat. Aug 30 2003. "Algorithms in Python" (alt. title "Learning Algorithms with Python", if Sedgewick has a trademark) is a book that uses Python as the vehicle to teach the basic ideas of algorithms -- what an algorithm is, how to notate them clearly, what O() notation means, etc. -- and to introduce a number of different algorithms. Discussions will rarely or never be rigorous. Requirements: * Minimal mathematical background should be required (basically functions, I think). Those bits of notation that are necessary will be explained as they're needed. * Elementary Python programming knowledge is needed, so readers should have a tutorial such as "Learning Python" alongside this book; it will not attempt to teach Python, though it will introduce relevant standard library modules. * Python 2.3 will be used. The goal is to make the book understandable by a bright high-school student. If the student goes on to formal computer science study, she will have a good foundation for it; if not, she'll understand some basic computer ideas. Exercises may be included (I haven't decided yet); they might be hard to invent, but they'd also be very useful for the book's use in teaching settings. If exercises are provided, I'll write up a set of unit tests for correct implementations; this will make it easy for teachers to check whether answers are correct and let them concentrate on programming style. It might be better to leave exercises out of the book and put them on a separate web site, where they can be continually updated. * Preface (the usual front matter) * Introductory comments * Intended audience; prerequisites; goals of book * Outline of book * Typographical conventions * Acknowledgements * Algorithmic Analysis (alt. title "What is an Algorithm"?) * Basic idea: an algorithm is a series of steps to perform a task * Example: finding the largest number in a list * Iterative formulation * Recursive formulation * Measuring time complexity: O, Theta, Omega notation * Compare different time complexities: O(n) vs O(lg n) * Applying O() to memory/space complexity * Time complexity of various Python built-in operations (dicts, lists) * Hashing Hashing is going to be the first serious algorithm used as an example, so it's going to be worked out in the most detail. Later chapters will explain algorithms, give reasons why they work, and discuss their O() complexity, but in less detail than in this chapter. * Basic concept of hash tables * Computing hash codes * Handling collisions * Resizing * Amortized algorithm costs (I can't see how to fit this into chapter 1, since you need a reasonably complicated algorithm to use as an example,) * Deleting hash table entries * Determining time complexity of hashing (this would be the most detailed explanation of determining time complexity; remaining chapters would be more hand-waving) * Graphs * Graph concepts * Different graph representations (Node objects, sets of arcs) * Traversal * Topological sorts * Example: working out file dependencies * Connected-components * Spanning trees * Shortest paths * Trees * Trees as a special case of graphs * Binary trees * Representations (Node objects, lists/tuples, tables) * Searching * Inserting * Deleting * Unbalanced trees * Recursive operations on trees * Balanced trees: * red-black * do AVL trees, or just mention them briefly? (Briefly, I think; there's already more than enough material in this outline!) * B-trees * Numbers (should the numeric material come before trees and graphs?) * Representing numbers on computers * Machine integers * Floating point representation * Large integers * Random number generation * Numeric Analysis * Polynomial evaluation * Finding zeros of functions * Differentiating functions * Integrating functions * Sorting * Basic concepts * Comparing * Stability * Simple algorithms: * Bubble sort * Shell sort * Insertion sort * Merge sort * Quicksort * Implementation * Issues * Time-complexity * Time complexity of sorting * Proof of O(n lg n) bound. * Breaking assumptions: parallellism, spaghetti sort Hard problems: NP-completeness * P and NP * NP-completeness * Explanation * Various examples of NP-complete problems * Show that all NP-complete problems are equivalent * Solving an NP-complete problem * Exhaustive search * Heuristics * Final thoughts I like books that close with some sort of summation. This brief final chapter will discuss a few general issues. * Think about time complexity * Do the simplest thing... * Write tests * Value clarity over optimization * Where to go from here? * More books to read; things to do * Current state of research (parallel, distributed, various domains) Optional Topics -------------------- This section lists various chapters for which I produced an outline but later decided aren't a core part of the book. Any of these chapters can be reintroduced at a later point. * Number Theory * Finite fields * Primality testing (kind of an odd duck) * RSA Geometric Algorithms: * Point representations * Convex hull * Line intersections * Range searching * Strings This will likely be a difficult chapter, both to write and to read, because FSMs will be a pain to explain. (Indeed, you could envision an entirely different document that covered FSMs, languages, Turing completeness, parsing, ... One for the sequel, I think.) * Simple searching * Boyer-Moore/KMP searches * Data compression * Regular expressions * Simple regex patterns * Finite automata * Languages * What finite automata can't do * Brief theory-of-computation overview * Real Implementations: Examines real-world implementation of various algorithms. * Random number generation: Mersenne twister * Karatsuba algorithm? (maybe too complicated and boring?) * Python's dictionary hashing * Python's sort * Game trees * Introduction * Game rules (tic-tac-toe; or maybe hnefatafl, a game I did for a school project once) * Simple tree search * Alpha/beta cutoff Possible Topics -------------------- * Heaps (can't figure out where to fit them in -- any suggestions?) Not Covered ------------------ * Linked lists (they aren't the source of many algorithms, and their implementation in Python doesn't seem very interesting)
I used the 'random issue' link on bugs.python.org heavily, and it worked really well because it showed me bugs that I wouldn't normally look at. Often it landed on bugs I could do nothing with, such as bugs for Windows or AIX or FreeBSD 4. Once I read the subject line, which was something like "cProfile doesn't support multiple thread stacks" and immediately whimpered in terror and hit 'random issue' again.
I trivially closed a few bugs that were already fixed or not actually bugs, found another patch that had been approved and could be committed (Antoine Pitrou did the actual commit), and inconclusively assessed a few patches.
Socializing on IRC was also fun. I wish we did bug days more frequently. Maybe I can nudge the DC Python group to have a weekend bug day in January or February.