Large-scale User Object Creation with PERL
Novell Cool Solutions: Feature
By Ian Pattison
Posted: 12 Oct 2005
Perl and LDAP save the day for Ottawa's French Catholic School Board
Handling large-scale user object creation - by Ian Pattison, with contributions from Thomas Rogan
This past summer, the Conseil des écoles catholiques de langue française du Centre-Est (the French Catholic school board in eastern Ontario [Ottawa]) opted to replace the student servers at each school, approximately fifty-three of them, with eight IBM blade servers at the central board offices in Ottawa. While this necessitated a great deal of planning for things like bandwidth, resource allocation, and IT staffing, the new challenge was creating user accounts for the 17,000 students and 1,500 teachers who were to use the system in the 2005/06 school year.
In the previous environment, each school had its own server and its own eDirectory tree. The local school administrators would manually create all the necessary user accounts each August. With a single tree and over 18,500 users, this was no longer a viable option, and the available prepackaged solutions could not handle the client's special requirements, so a custom solution was needed.
Perl, the Practical Extraction and Report Language, was designed to address exactly these types of problems. Since eDirectory supports the Lightweight Directory Access Protocol (LDAP), it was quickly determined that using LDAP to import the users would be the easiest way to go. Someone suggested that we use the built-in functionality in Perl to insert the users directly into the tree, but instead we used ICE, Novell's Import Conversion Export utility, to import an LDIF file generated by a Perl script.
Cleaning the Input
The available raw data posed a bit of a problem for us. Board personnel exported the complete list of students from their student enrollment management package and provided us with an Excel file. This file contained the following columns: School Name, Student Number, Student Last Name, Student First Name(s) and Grade.
The first name field posed the first problem: the column contained 'First Name' OR 'First Name Middle Name' OR 'First Name, Middle Name' OR 'First Name, Middle Name, Middle Name, Middle Name, Middle Name'. Because we were planning to convert the Excel file to Comma-Separated Values (CSV) for our Perl script, the embedded commas had to be cleaned out of the names first. The only manual pre-processing was a find-and-replace that replaced every comma with nothing (that is, deleted it), after which the file was saved as CSV. From that point on, the script took over.
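The comma clean-up was done by hand in Excel, but the same step could be scripted. Here is a minimal sketch in Python (not the Perl we used, and not taken from the original script). The function name and sample values are hypothetical, and it replaces each comma with a space before collapsing whitespace, so that a value like 'First,Middle' does not run together.

```python
def clean_first_names(raw: str) -> str:
    """Drop commas from a first-name field and collapse runs of whitespace."""
    # Replace commas with spaces, then let split()/join() squeeze the gaps.
    return " ".join(raw.replace(",", " ").split())


# Hypothetical values in the formats described above.
samples = ["Marie", "Marie Claire", "Marie, Claire", "Marie,  Claire, Anne"]
cleaned = [clean_first_names(s) for s in samples]
```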
The next problem was that the school names listed in the CSV did not match the names of the Organizational Units (OUs) that already existed in the tree. In addition, most of the fields in the CSV contained extraneous spaces. A series of IF statements translated the school names; at the same time, we removed any extraneous spaces and carriage returns before writing everything out to a temporary work file.
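A minimal sketch of this clean-up pass, in Python rather than Perl and not taken from the original script. It swaps the series of IF statements for a lookup table, which is a substitution made here for brevity. The school names, OU names and function names are all hypothetical.

```python
# Hypothetical mapping from exported school names to existing OU names.
SCHOOL_TO_OU = {
    "ecole exemple a": "EcoleA",
    "ecole exemple b": "EcoleB",
}


def normalize_field(value: str) -> str:
    """Strip leading/trailing whitespace (including CR/LF) and collapse inner runs."""
    return " ".join(value.split())


def school_to_ou(school_name: str) -> str:
    """Translate an exported school name into the OU name used in the tree."""
    key = normalize_field(school_name).lower()
    if key not in SCHOOL_TO_OU:
        # Fail loudly instead of creating users in the wrong container.
        raise ValueError(f"no OU mapping for school {school_name!r}")
    return SCHOOL_TO_OU[key]


def clean_row(row):
    """Clean every field of one CSV row; the first column is the school name."""
    fields = [normalize_field(f) for f in row]
    fields[0] = school_to_ou(fields[0])
    return fields
```

Raising an error on an unknown school makes a missing mapping visible before anything is imported.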
Generating user IDs and Passwords
The biggest problem here was with names. Since this is a French-language school board, nearly all the students have French names. This made for lots of duplicates (names like Séguin and Tremblay made up nearly 5% of the list by themselves). The board had previously used a naming standard of five characters from the last name and one character from the first name to generate the user name (for example, my user ID would be PattiI). A quick parse of the user list showed more than 4,000 duplicate names under this standard. We managed to convince them to add a second character from the first name (PattiIa), which reduced the duplicates to fewer than 2,000. The remaining duplicates were handled by appending a number to the name. This further complicated things because we needed to be democratic, making sure that all duplicate users were handled equally - we couldn't have SeguiJa and SeguiJa1; they had to be SeguiJa1 and SeguiJa2. This was done with WHILE loops and IF statements, 15 layers deep, to make sure all the duplicates were caught.
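The naming rule can be sketched as follows, in Python rather than Perl. This is not the original script: it uses a two-pass count instead of nested WHILE loops, and the function names are hypothetical. It also assumes accents are simply stripped from the ID and that capitalization follows the PattiIa example.

```python
import unicodedata
from collections import Counter


def strip_accents(text: str) -> str:
    """Decompose accented letters and drop the combining marks (é -> e)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def base_id(last: str, first: str) -> str:
    """Five characters of the last name plus two of the first name."""
    def letters(s):
        # Keep ASCII letters and digits only (drops spaces, hyphens, apostrophes).
        return "".join(c for c in strip_accents(s) if c.isascii() and c.isalnum())
    return letters(last)[:5].capitalize() + letters(first)[:2].capitalize()


def assign_ids(people):
    """Return one ID per (last, first) pair; every member of a duplicate
    group gets a number, so no one keeps the bare name."""
    bases = [base_id(last, first) for last, first in people]
    totals = Counter(bases)   # pass 1: how many users share each base ID
    seen = Counter()          # pass 2: running index within each group
    ids = []
    for base in bases:
        if totals[base] == 1:
            ids.append(base)
        else:
            seen[base] += 1
            ids.append(f"{base}{seen[base]}")
    return ids
```

Counting first and numbering second is what keeps the scheme "democratic": the script knows a name is duplicated before it assigns the first ID in the group.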
The other issue with user IDs was accented characters - French names, French accents. While accents are not valid in things like user IDs, they had to be preserved in things like the full name field (which was used by NetMail, the board's e-mail package). To import this data via LDAP, the accented values first had to be converted to UTF-8 and then Base64 encoded.
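In an LDIF file, a Base64-encoded value is marked with a double colon after the attribute name. Here is a minimal sketch of the encoding step, in Python rather than Perl and not taken from the original script. The function name is hypothetical, the attribute names in the usage note are assumptions, and long lines are not folded.

```python
import base64


def ldif_attr(name: str, value: str) -> str:
    """Format one LDIF attribute line, Base64-encoding values that are not
    plain printable ASCII or that start or end in a way LDIF treats specially."""
    safe = (
        value.isascii()
        and value.isprintable()
        and not value.startswith((" ", ":", "<"))
        and not value.endswith(" ")
    )
    if safe:
        return f"{name}: {value}"
    # Convert to UTF-8 first, then Base64-encode the resulting bytes.
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{name}:: {encoded}"
```

For example, `ldif_attr("cn", "SeguiJa1")` yields a plain `cn: SeguiJa1` line, while `ldif_attr("fullName", "Jacques Séguin")` yields a `fullName::` line carrying the encoded value.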
Passwords were a simple matter; a random 6-character alphanumeric string was generated for each user.
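A minimal sketch of such a generator, in Python rather than Perl and not taken from the original script. The choice of random source is an assumption, since the article does not say which one was used. The sketch uses the standard `secrets` module, which is intended for security-sensitive values.

```python
import secrets
import string

# Upper- and lower-case ASCII letters plus digits.
ALPHABET = string.ascii_letters + string.digits


def random_password(length: int = 6) -> str:
    """Return a random alphanumeric string of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```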
Bringing It All Together
The script finally generated three output files: an LDIF file for the tree, a batch file to create home directories and assign rights, and a CSV list of all users and passwords for the humans to use. To simplify matters further, the user list was ordered by school, then grade, then user last name.
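The ordering of the human-readable list can be sketched as a sort on a three-part key. This is in Python rather than Perl and is not taken from the original script. The column names are hypothetical, and grades are assumed to be stored as integers so that grade 10 sorts after grade 9.

```python
import csv

# Hypothetical column layout for the human-readable list.
FIELDS = ["school", "grade", "last", "first", "user_id", "password"]


def write_user_list(users, out):
    """Write users to `out` as CSV, ordered by school, then grade, then last name."""
    ordered = sorted(users, key=lambda u: (u["school"], u["grade"], u["last"]))
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(ordered)
    return ordered
```

Here `out` can be any writable text file object, for example one opened with `open(path, "w", newline="", encoding="utf-8")`.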
Without Perl and LDAP, it would have taken over 45 person-days of effort to create all the user accounts manually. As it was, it took 9 days to create the script, 15 seconds to run it, and approximately an hour to import the data. Next year, when it comes time to create the user accounts again, it will take a maximum of one day to modify the script to handle any changes to their standards and create all the users again. Perl saved the day!
About the Author
Mr. Ian Pattison is an experienced Senior Consultant with Vision NetWorks, based in Toronto, Ontario, Canada, and may be reached at firstname.lastname@example.org.
Vision NetWorks (headquartered in Ottawa, Ontario, Canada) is the leading provider of Consulting and Professional Services, and is a wholly owned subsidiary of 1470156 Ontario Inc.
Vision NetWorks management, staff and associates have been world leaders for over fifteen years in the implementation and integration of Novell, RIM, GWAVA, Omni, Messaging Architects and other premier technology products. Visit them at http://www.vision-networks.ca.
Novell Cool Solutions (corporate web communities) are produced by WebWise Solutions. www.webwiseone.com