I finally got the opportunity to migrate an old NT style domain to Active Directory. In both cases the servers were running Samba. Let’s just say it was an experience I’d rather not repeat.
This particular project spanned 18 months due to scheduling issues (mostly just finding a time when all three people were available and we could take the system down).
The Samba 3 domain was installed in 2007 as a direct replacement for an older Windows NT SBS server that was failing. This gave us freedom from licensing restrictions as well as more flexibility at the time. It was a single server sharing user profiles, user home directories, and most of the shared files.
Fast forward a few years and we have seen an expansion in the amount of data being collected. The old server has reached a functional limit and we need features only available via Active Directory.
So we picked up a couple servers so we could split the load. We wanted to put the user profiles and home directories on one machine and the shared files on another. During this time we also upgraded the primary network with support for 10G.
The first part of the migration went as smoothly as could be expected. I had some cleanup to do where users and groups needed to be removed. I did the initial testing in a virtual server and then repeated it live. This is where I found that Debian 9 was still shipping Samba 4.5 when 4.9 was current. So I forced an upgrade to 4.9 and away I went.
As part of the migration we made backups of each Windows desktop to a USB drive. It took a while to get through all 30 machines (all Windows 7), and we found a few issues along the way which added to the TODO list. The Samba classicupgrade process warns that it is a one-way process, hence the backups. We did discover that the desktops establish a connection to Kerberos as soon as they are connected to the network (before any user logon attempt), so I was glad we had the backups just in case. After the backups were done and we had kicked everyone out of the building, we got to work removing the Samba 3 specific changes on each desktop (registry, group policy, DNS, and LMHosts were touched). Backups were completed Thursday night, as we could get ahead a bit due to people being on holidays.
Friday I isolated the new server after fetching the required files from the old server and began the migration. Of course I hit more errors and forgot a couple of steps during this process. However, it was easy to start over at this point. I did have issues with permissions on the user profiles and user home directories, but that was expected, as keeping them on the domain controller is no longer a recommended installation. I made the mistake of not changing the server name in the smb.conf file before the upgrade and then wondered how the old name got into the network.
So I moved the second server into the new environment and started its configuration. I had to run a bunch of updates as I had not touched it for a while. Thankfully I run a local mirror, so downloading all 740 updates went very fast. However, it was not all fun and games when it came to getting it to talk with the new AD-DC. I could see the server via smbclient. I could see the users and groups via wbinfo. I could only see local users via getent passwd. So after a few hours of trial and error we called it a day and agreed to talk later and schedule a start time for Saturday. I got home at 19:45 (started at 08:00), grabbed something to eat in front of the TV, and got back to work at 21:00. I told the others not to start early, as I was leaning more towards a lunchtime start than breakfast. I called it quits at 01:30.
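For anyone who hits the same symptom (wbinfo lists domain users but getent passwd shows only local ones): getent resolves names through the Name Service Switch, so one thing worth checking is whether `winbind` is listed for the `passwd` and `group` databases in /etc/nsswitch.conf. Whether that was the cause on this server is not known. The sketch below is only a hypothetical helper for that one check; the function names and the sample text are made up for illustration, and it says nothing about the smb.conf idmap settings or whether the winbind NSS library is installed.

```python
import re


def nss_sources(nsswitch_text, database):
    """Return the sources configured for one NSS database, in order."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        if name.strip() == database:
            # Remove action items such as [NOTFOUND=return]
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def missing_winbind(nsswitch_text, databases=("passwd", "group")):
    """List the databases that do not consult winbind at all."""
    return [db for db in databases
            if "winbind" not in nss_sources(nsswitch_text, db)]


# Illustrative sample only; on a real machine read /etc/nsswitch.conf instead.
sample = """\
passwd:         files systemd
group:          files systemd winbind
shadow:         files
"""

print(missing_winbind(sample))
```

An empty list means both databases at least name winbind as a source; anything else points at a line worth editing before digging further into smb.conf and krb5.conf.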
I got up the next day at 06:00 and started working again. I needed a break, so I went grocery shopping, then got back and went into the office for 11:00. I somehow made it work, as the server can now see all the users and groups. Just what did it I do not know. I know I edited the smb.conf and krb5.conf files. It may have been a restart vs a reload that did it. I have yet to figure out which setting made it work though. However, I was having issues with permissions again. After the other party called in for a status update, we decided to let him join us on Sunday, and the two of us went ahead and finished the desktop shutdown steps (registry, group policy, DNS, and LMHosts) followed by disconnecting each machine from the network. After that I disabled Samba on the old server and put the new servers back into the main network. Once I got home I started syncing the user files and shared files to the new servers so we’d be ready for Sunday.
Sunday morning I started updating the new Active Directory to fill in the missing fields we needed (profile path and home folder mostly). At this point I started seeing permissions issues again, as users were not able to create their profile folders when first logging in. We worked around the issue and started each desktop one at a time by plugging it in and logging in as that user (I had changed each user account password to something we could remember). We had to remap network drives, change desktop shortcuts, and test each application, only to find out that more changes were still needed. We found that ProMiles took almost 15 minutes to open on the new server. Considering the 10 years of processor and network improvements between the servers, this was unexpected. The QNAP NAS units also presented issues: even though they joined without errors, they would not authenticate any users. We called it a day at 17:00 (started at 06:30). We did leave a sign on each desktop where we changed the user’s password. The sign had a headline of “You have been pwned.” followed by a Hello Kitty picture and “See Mike”, all in large letters clearly readable from across the room. We did run into an issue where an old virtual machine running Windows 2000 had to unjoin the old domain and join the new Active Directory. We could not remember what the local administrator password was, so we left this one for the next day along with a couple of other servers.
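The profile path and home folder fields filled in on Sunday morning are stored in Active Directory as the user attributes `profilePath`, `homeDirectory`, and `homeDrive`. As a rough illustration of what filling them in for 30 accounts might look like, here is a minimal sketch that builds an LDIF modify record for one user. Everything specific in it is an assumption: the server name, share names, drive letter, and DN are hypothetical, the values are assumed to be plain ASCII, and this is not necessarily how the fields were set during this migration.

```python
def unc(server, share, name):
    """Build a UNC path such as \\\\server\\share\\name."""
    return "\\\\{}\\{}\\{}".format(server, share, name)


def profile_ldif(user_dn, username, server,
                 profile_share="profiles", home_share="users",
                 home_drive="H:"):
    """Return an LDIF 'modify' record setting profile and home attributes.

    The share names and the drive letter are placeholders, not values
    taken from the migration described here.
    """
    attrs = [
        ("profilePath", unc(server, profile_share, username)),
        ("homeDirectory", unc(server, home_share, username)),
        ("homeDrive", home_drive),
    ]
    lines = ["dn: " + user_dn, "changetype: modify"]
    for name, value in attrs:
        lines.append("replace: " + name)
        lines.append(name + ": " + value)
        lines.append("-")  # each modification ends with "-" (RFC 2849)
    return "\n".join(lines) + "\n"


# Hypothetical user and file server, for illustration only.
print(profile_ldif("CN=jdoe,CN=Users,DC=example,DC=com", "jdoe", "fileserver1"))
```

The resulting text could be reviewed by hand and then applied with an LDIF-capable tool such as ldbmodify, or the same three values could be typed into the Profile tab of the user's properties in the Windows management tools.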
Monday morning had us starting at 05:30 for a short day. I had started even earlier from home, where I managed to get ProMiles to work at normal speeds. I fixed up the accounting server (Windows Server 2008 R2) so the accounting staff could get to work; I had to unjoin it from the old domain before it would work on the new one. The old Citrix server (Windows Server 2003) needed the same unjoin/join process. We called it quits around 11:00 and did nothing more that day, leaving Citrix connected but not configured for users and the QNAPs not functional. Sometime in here I tried a user and password combination on the old Windows 2000 virtual machine and managed to guess correctly, so I was able to complete the required changes.
Tuesday I spent 14 hours of frustration trying to get the QNAPs to work. They are behaving just like CHEECH was, where wbinfo shows the users but getent does not. I ended up restoring the old Samba 3 LDAP settings on the QNAPs to get the business running again. This works so long as both systems have the same ID and password for each user. It also lets me use a third QNAP, which the end users do not use, to test against the AD-DC.
After trawling the forums and searching via Google I have yet to get the QNAPs to work. I’ve gone backwards and forwards with firmware revisions, reinitialized the unit, made manual changes to smb.conf and krb5.conf to match the working server, and compared settings changes between 4.4 and 4.9, all to no avail so far. I’ll post more on this adventure later.
Wednesday the rest of the office staff came in to work and not surprisingly many ignored the sign that was taped to their monitor and tried to access the system without talking to Mike. This ended up with them getting locked out. Due to the situation with the QNAPs we tried getting a user to change their password to match the one they had before the update. This is where we found that the user desktop required a complex password even though the Samba AD-DC had this turned off. We ended up getting the users to change their password through the domain management tools after we unlocked them.
Thankfully most things worked out-of-the-box for this upgrade, but the QNAPs are still not fully functional, and I still have to adapt the other servers that use LDAP lookups as well as “fix” the Active Directory for missing information. These will be discussed in later posts.