akuchling: Sherlock Hemlock (Default)
Collects notes for an upcoming sprint where our goal will be to write an SMTP daemon using asyncio. SMTP: RFC 5321

Asyncio:

akuchling: Sherlock Hemlock (Default)
Last week the BACON-PIG decided to have an informal sprint to hack on core Python, and today we held it at my (newly-purchased!) house.

Barry Warsaw is the release manager for Python 2.6.9, so we looked at various release-blocking issues for 2.6.9. The bulk of them at this time are various DDoS possibilities where .readline() is called with no size limitation. An attacker can then feed large amounts of data and consuming an arbitrary amount of memory. The fixes all take the same general pattern: add a size limitation to the .readline() call that varies depending on the protocol, and then report an error if the line hits the limit.

I also committed the traceback.clear_frames() function that I was working on last week.

akuchling: Sherlock Hemlock (Default)
Today I was poking around on the Python bug tracker and noticed issue 1565525, which dates back to 2006 and the old SourceForge bug tracker. The issue has a lot of discussion, but today it boils down to adding a helper function to clear all the frames in a traceback. That's a straightforward task, so I implemented it and posted a patch.
akuchling: Sherlock Hemlock (Default)
Over the weekend I worked on various Python bugs.
I should probably just commit my edits to the Unicode howto at this point, and then move on to updating the regex howto.
akuchling: Sherlock Hemlock (Default)
Another book project: this was written as a book proposal aimed at O'Reilly, for an actual printed book (back when people actually bought printed computer books). I wrote two sections, defining an algorithm and explaining O() notation; parts of these two sections live on in my 50-examples book.
Outline
---------

Written Sat. Aug 30 2003.

"Algorithms in Python" (alt. title "Learning Algorithms with Python", if
Sedgewick has a trademark) is a book that uses Python as the vehicle to
teach the basic ideas of algorithms -- what an algorithm is, how to
notate them clearly, what O() notation means, etc. -- and to introduce a
number of different algorithms.

Discussions will rarely or never be rigorous.  Requirements:

* Minimal mathematical background should be required (basically
  functions, I think).  Those bits of notation that are necessary will
  be explained as they're needed.
* Elementary Python programming knowledge is needed, so readers
  should have a tutorial such as "Learning Python" alongside this book;
  it will not attempt to teach Python, though it will introduce relevant
  standard library modules.  
* Python 2.3 will be used.

The goal is to make the book understandable by a bright high-school
student.  If the student goes on to formal computer science study, she
will have a good foundation for it; if not, she'll understand some basic
computer ideas.

Exercises may be included (I haven't decided yet); they might be hard to
invent, but they'd also be very useful for the book's use in teaching
settings.  If exercises are provided, I'll write up a set of unit tests
for correct implementations; this will make it easy for teachers to
check whether answers are correct and let them concentrate on
programming style.  It might be better to leave exercises out of the
book and put them on a separate web site, where they can be continually
updated.

* Preface (the usual front matter)
  * Introductory comments
    * Intended audience; prerequisites; goals of book
  * Outline of book
  * Typographical conventions
  * Acknowledgements

* Algorithmic Analysis (alt. title "What is an Algorithm"?)
  * Basic idea: an algorithm is a series of steps to perform a task
  * Example: finding the largest number in a list
    * Iterative formulation
    * Recursive formulation
  * Measuring time complexity: O, Theta, Omega notation
    * Compare different time complexities: O(n) vs O(lg n)
    * Applying O() to memory/space complexity
  * Time complexity of various Python built-in operations (dicts, lists)

* Hashing 
  Hashing is going to be the first serious algorithm used as an
  example, so it's going to be worked out in the most detail.  Later
  chapters will explain algorithms, give reasons why they work, 
  and discuss their O() complexity, but in less detail than in this chapter.
  * Basic concept of hash tables
  * Computing hash codes
  * Handling collisions
  * Resizing 
    * Amortized algorithm costs (I can't see how to fit this into
      chapter 1, since you need a reasonably complicated algorithm to
      use as an example,)
  * Deleting hash table entries
  * Determining time complexity of hashing 
    (this would be the most detailed explanation of determining time complexity; 
     remaining chapters would be more hand-waving)

* Graphs
  * Graph concepts
  * Different graph representations (Node objects, sets of arcs)
  * Traversal
  * Topological sorts
    * Example: working out file dependencies
  * Connected-components
  * Spanning trees
  * Shortest paths

* Trees
  * Trees as a special case of graphs
  * Binary trees
    * Representations (Node objects, lists/tuples, tables)
    * Searching
    * Inserting
    * Deleting
    * Unbalanced trees
  * Recursive operations on trees
  * Balanced trees: 
    * red-black
    * do AVL trees, or just mention them briefly?  
      (Briefly, I think; there's already more than enough material in this outline!)
    * B-trees
  
* Numbers (should the numeric material come before trees and graphs?)
  * Representing numbers on computers
    * Machine integers
    * Floating point representation
    * Large integers
  * Random number generation
  
* Numeric Analysis
  * Polynomial evaluation
  * Finding zeros of functions
  * Differentiating functions
  * Integrating functions

* Sorting
  * Basic concepts
    * Comparing
    * Stability
  * Simple algorithms:
    * Bubble sort
    * Shell sort
    * Insertion sort
    * Merge sort
  * Quicksort
    * Implementation
    * Issues
    * Time-complexity
  * Time complexity of sorting
    * Proof of O(n lg n) bound.
    * Breaking assumptions: parallellism, spaghetti sort

Hard problems: NP-completeness
  * P and NP
  * NP-completeness
    * Explanation
    * Various examples of NP-complete problems
    * Show that all NP-complete problems are equivalent
  * Solving an NP-complete problem
    * Exhaustive search
    * Heuristics

* Final thoughts 
  I like books that close with some sort of summation. 
  This brief final chapter will discuss a few general issues.
  * Think about time complexity
  * Do the simplest thing...
  * Write tests
  * Value clarity over optimization
  * Where to go from here?
    * More books to read; things to do
    * Current state of research (parallel, distributed, various domains)


Optional Topics
--------------------

This section lists various chapters for which I produced an outline 
but later decided aren't a core part of the book.  Any of these chapters can 
be reintroduced at a later point.

* Number Theory
  * Finite fields
  * Primality testing (kind of an odd duck)
  * RSA
  
Geometric Algorithms:
  * Point representations
  * Convex hull
  * Line intersections
  * Range searching

* Strings 
  This will likely be a difficult chapter, both to write and to read,
  because FSMs will be a pain to explain.  (Indeed, you could envision
  an entirely different document that covered FSMs, languages, Turing
  completeness, parsing, ...  One for the sequel, I think.)
  * Simple searching
  * Boyer-Moore/KMP searches
  * Data compression
  * Regular expressions
    * Simple regex patterns
    * Finite automata
    * Languages
      * What finite automata can't do
    * Brief theory-of-computation overview

* Real Implementations:
  Examines real-world implementation of various algorithms.
  * Random number generation: Mersenne twister
  * Karatsuba algorithm?  (maybe too complicated and boring?)
  * Python's dictionary hashing
  * Python's sort

* Game trees
  * Introduction
  * Game rules (tic-tac-toe; or maybe hnefatafl, a game I did for a school project once)
  * Simple tree search
  * Alpha/beta cutoff
  


Possible Topics
--------------------

* Heaps (can't figure out where to fit them in -- any suggestions?)


Not Covered
------------------

* Linked lists (they aren't the source of many algorithms,
  and their implementation in Python doesn't seem very interesting)
akuchling: Sherlock Hemlock (Default)
The morning was spent biking on the trail that loops around BWI airport, but in the afternoon I signed on to IRC for the Python Bug Day spearheaded by the Montreal user group.

I used the 'random issue' link on bugs.python.org heavily, and it worked really well because it showed me bugs that I wouldn't normally look at. Often it landed on bugs I could do nothing with, such as bugs for Windows or AIX or FreeBSD 4. Once I read the subject line, which was something like "cProfile doesn't support multiple thread stacks" and immediately whimpered in terror and hit 'random issue' again.

I trivially closed a few bugs that were already fixed or not actually bugs, found another patch that had been approved and could be committed (Antoine Pitrou did the actual commit), and inconclusively assessed a few patches.

Socializing on IRC was also fun. I wish we did bug days more frequently. Maybe I can nudge the DC Python group to have a weekend bug day in January or February.

Profile

akuchling: Sherlock Hemlock (Default)
akuchling

September 2016

S M T W T F S
    123
45678910
11121314151617
18192021222324
2526272829 30 

Syndicate

RSS Atom

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Jul. 21st, 2017 08:33 pm
Powered by Dreamwidth Studios