09 May 2022

Top 10 Pitfalls of a Microsoft Office 365 Migration


Photo by JESHOOTS.COM on Unsplash

Email is one of the IT resources people take most for granted. There are so many expectations attached to it. It is expected to work… no … matter … what. More and more companies are finding that it is more cost-effective to migrate from self-hosted, on-premises email solutions to cloud solutions. Unfortunately, once the decision is made to migrate, there is an unrealistic expectation that a systems administrator will wave a magic wand, 25 years of mailboxes will suddenly be in the cloud, and end-users, not least the C-level, will not be impacted.

There are lots of pros and cons to hosting email in the cloud versus self-hosting on-premises, but that is outside the scope of this post. I will say, my favorite email system I have ever used is still SquirrelMail, from my college days. It was certainly not very feature-packed, but it did give me an ssh login to the mail server. That server was one of the few things that was not rate-limited on the network… and ssh proxying was not blocked. Oh, the good-ole days! This also shows how much we have grown as a community. In researching this post, I saw old forums where people needing help migrating were resorting to using each employee's copy of Outlook to pull the mailbox from the old system and push it to the new one.

Bil Keane 2016

Since then, I have been responsible for countless migrations: SquirrelMail to MS Exchange 2003, Exchange 2003 to Exchange 2007, Exchange to Gmail, Gmail to Office 365, Exchange to O365, Exchange 2016 to Hybrid O365, and plenty of derivatives in between. Pull up a chair, grab a cup of coffee, and take a deep breath. You can do this migration and you can avoid these top 10 pitfalls of migrating to Office 365!

10. Buy-Off Your Users (with Training)

If this were a “Top Two Pitfalls” post… they would both be about end-users. I have been asked several times prior to a migration, “This will be seamless, right? No interruptions? Users won’t need to do anything?” The simple answer is NO. Sure, in a perfect world users are on a modern version of Outlook and autodiscover will switch them over; however, this is the time to roll out proper training for the new technologies, web portals, and policies that O365 will afford your company. Don’t take the easy out, and don’t make promises you know you might not be able to keep. Yes, users will be impacted, but that is a good thing.

9. Poor Planning Produces Poor Results

There are many steps that go into a proper migration. Even a Minimal Hybrid Migration, where you have coexistence between your on-premises and cloud tenants, is still an involved process:

Step 1: Verify you own the domain.
Step 2: Start express migration.
Step 3: Run directory synchronization to create users in Microsoft 365 or Office 365.
Step 4: Give Microsoft 365 or Office 365 licenses to your users.
Step 5: Start migrating user mailbox data.
Step 6: Update DNS records.

It is not as simple as changing an MX record and hitting a button. Take the time to do the research and then convey those expectations. Microsoft has a lot of resources to help you estimate the time it will take… double it.

Public Folders, Shared Mailboxes, and custom groups will all take extra time. If you can skip migrating Public Folders… do so. It is far preferable to start using modern solutions such as shared resources. Some people do not want to spend the time on the migration and training involved in a new resource, but if you stick with Public Folders, remember that your only viable option for managing them is going to be PowerShell.

8. “Let it Go, Let it GO”

“Can’t we just apply the retention policies AFTER we migrate?”

I once worked with a large international media company. There were video editing departments, still graphics, animations, radio, etc. Many of these departments did not want to develop a standardized workflow and so used the email system for copy-approval. We are not talking about proxy copies either… original full-resolution media. Needless to say, the company did not want to pay for more than the basic 50GB mailbox for most employees, and many of those mailboxes were much bigger than 50GB. You can get an idea of your larger mailboxes with the following PowerShell command:

[PS] C:\>Get-Mailbox -ResultSize Unlimited | Get-MailboxStatistics | Sort-Object TotalItemSize -Descending | Select-Object DisplayName,TotalItemSize -First 100 | Export-CSV top100mailboxes.csv

Even older versions of most email servers have a way to apply retention policies, and most companies have a records-management policy covering how long things must be kept… and how long they should not be kept. This is a great time to make sure those are being implemented. Retention policies are very straightforward, but they should be tested first, following Microsoft's documentation.

7. Be Exceptional! (But Don’t _Have_ Exceptions)

There is a place for granular policies and permissions (and I think those should always be applied to groups, with groups then applied to users), but either way, that place is not in the basic setup of a user mailbox. The more standardized mailboxes are, the easier it will be to diagnose problems.

A few years ago, I handled a migration for a company where a high-ranking manager was having terrible problems: some board members could not open his emails. It turned out there were lots of exceptions made on his account, among which was allowing him to continue using Rich Text Format for everything, and RTF is only readable by a few email clients. The excuse was that no one wanted to make him switch or modernize. The result was that he was one of only a few people who actually had problems during the migration.

6. If you always do what you have always done… you might not get the same results…

This goes along with proper planning, but a particular set of unnecessary headaches is caused by changing email systems and thereby changing spam rules. Spam rules are a necessary evil, but the technology has gotten a lot better. Too often, companies just want to migrate their old rules into their new system. I would caution against this, in large part because newer systems, like Barracuda's and Proofpoint's cloud offerings, or even Microsoft's built-in rules, rely more on artificial intelligence to filter out spam and bad actors, whereas older systems are generally more explicit and rule-based.

5. Garbage In — Garbage Out

A migration to a new system will not wipe out a decade of bad practices. You can't blame Twitter for not stopping you from posting that embarrassing tweet, and the import tools Microsoft offers are no more magical. Moreover, if you are moving from an on-premises system to the cloud, you are going to have to pay for each of those mailboxes from past employees that were never cleaned out. The same goes for retention policies, litigation holds, weird routes, old groups, ancient contacts, and the disclaimer/company footers. Clean it up ahead of time, and use the momentum to push best practices going forward.

Fortunately for all of us, Microsoft includes some best-practices scanners, and there are also many great resources out there devoted to the topic.

4. Old dogs CAN learn new tricks (As long as they don’t require TLS)

Photo by Richard Bell on Unsplash

Often overlooked, but never forgotten by the people who use them, legacy clients, like copy machines, will need an SMTP connection. It is also necessary to evaluate any DMZ, CHD, or other isolated networks you may have that don't have clear access to the Office 365 servers. If you use a TAP/SPAN port on your network, these devices are pretty easy to find with your network-monitoring solution; otherwise, you can use Wireshark to monitor and log internal SMTP connections.

The easiest solution is to make sure that your firewall and routers have access control lists and routes to the O365 servers, but it may also be necessary to run a small SMTP proxy server. I like running a small containerized appliance, but you can also use a simple Linux server.
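As a sketch of that proxy approach: on a Linux relay host, Postfix can forward legacy devices' mail upstream with just a few lines in main.cf. The hostname, port, and credential path below are assumptions to adapt, not a drop-in config:

```
# /etc/postfix/main.cf (illustrative relay settings)
# Forward everything to Office 365's submission endpoint over TLS
relayhost = [smtp.office365.com]:587
smtp_tls_security_level = encrypt
# Authenticate as a licensed mailbox or connector account
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
```

Your copiers and scanners then point at this relay on port 25, and only the relay needs a route and credentials to O365.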

3. Only YOU can stop security breaches

1989; Smokey Bear poster showing a half body image of Smokey pointing at the audience with one hand while holding a shovel in the other hand. Poster reads “Only You”. This work is maintained in the National Agricultural Library, in Beltsville, MD.

Too often, in the middle of a migration, I have heard the phrase “we will only open it up till we get this migration done.” This could be the firewall, the SSL requirements, or giving a service account more permissions than necessary. Just say NO. Always get things working the right way and don't cut corners. Exchange servers are in no way immune to malicious attacks, and when your focus is on the migration, it is easy to let your guard down on keeping the on-premises system safe. Make sure you are keeping up on your daily/weekly/monthly security tasks:

  1. Keep Exchange servers up to date
  2. Maintain Firewalls
  3. Keep security appliances and software up to date, and keep checking logs (Symantec, Barracuda, Proofpoint, etc.)
  4. Secure network hosting Exchange
  5. Monitor server logs
  6. Use certificates for ALL external services
  7. Limit administrative access and elevated accounts (including service accounts)
  8. Enable role-based access control and require strong passwords
  9. Audit admin and other mailbox activities
  10. Use Microsoft’s security Utilities (Safety Scanner, Defender, Security Configuration Wizard, Security Compliance Toolkit, Exchange Analyzer)

2. But it was going so well… (Plan B)

Going back to number 9, I have to reiterate that having a good plan is necessary. There is nothing worse than doing an overnight migration and finding out at 7 am that the system is hosed and you have to figure out how to get email back up and running. Each step needs to be thought out. One thing we did at a previous employer was to set up a temporary email portal for checking incoming messages, using the portal feature of our enterprise spam filter.

So often the problem is just an internet interruption in the middle of the migration; sometimes a firewall rule was changed, or your on-premises public IP may have changed. In these cases, you can just restart the batch that failed. Very rarely have I had to resort to pulling a PST (or a backup) of a user's mailbox and pushing it to the new system.

1. Buy-Off Users… again

https://poweroutage.report/california

Finally, as I said before, email is one of the technologies most taken for granted; people just expect it to work. The electric grid, gas companies, and telcos do not have 100% uptime, and email doesn't either. While I pushed end-user training in number 10, this is also a great time to get better buy-in and understanding from management. There are so many options and solutions in modern email services; make sure you are using the ones that are best for your company. This migration can be as much about moving to better practices and better technologies as it is about moving to a cheaper platform in the cloud.

Good luck! Whether this is your first email migration or your fiftieth, let me know how it went. Reach out on Twitter and tell me if you learned anything new this time around.

05 April 2022

Shifting to Better Secrets in Kubernetes

Who hides behind a mask (secret, secret; I’ve got a secret)
So no one else can see (secret, secret; I’ve got a secret)
My true identity!
- “Mr. Roboto” by Styx

Happy Golden Years

Twenty years ago, you would be reading this article on myspace or xanga with a beautiful midi rendition of “Mr. Roboto” by Styx playing on loop. The ’90s and early ’00s were a great time for IT. We had Napster, AOL Instant Messenger, and a guy named Geoff running your corporate webpage, intranet, phone system, and tech support for ClarisWorks.

Today, Geoff is a CTO with five or six departments handling the various aspects of those same needs he took care of just twenty short years ago. Certainly, processes have changed over the years, because both the product in and the product out have changed. We are better at securing data; all of our (plain-text) secrets are no longer living next door in /var/www/.htaccess. We know we need to keep track of certificates, tokens, and database access keys. None of us want our companies showing up in the headlines as the next big data breach. Arguably, however, some habits and attitudes still linger from the early days because “we are just waiting for Earl to retire” or “it would cost far too much to ask the devs to retool everything.” Stick around; we will discuss those attitudes and look at ways to start the paradigm shift toward better practices, repeatable methods, and a better night's sleep.

The Perfect World

WAAC — SILENCE MEANS SECURITY, U.S. National Archives and Records Administration

There are many reasons to move secrets to a different abstraction layer, and in a perfect world, that layer is utilized by our applications natively. In Kubernetes, that means containerized apps can consume injected passwords, tokens, certificates, etc. as variables and files sourced from a secret object. These secret objects should then be managed by a specific team or by a secrets-management system like Google Secret Manager, Azure Key Vault, AWS Key Management Service, or HashiCorp Vault. Role-based access control (RBAC) should then be applied to the secret objects so that only the people directly responsible have access to list, edit, create, etc.
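As a minimal sketch of that perfect world (all names here are hypothetical, not from the project later in this post), a pod can consume a secret object both as environment variables and as mounted files:

```yaml
# Illustrative only: assumes a Secret named "db-creds" with keys
# "username" and "password" already exists in this namespace.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
  - name: app
    image: alpine
    env:
    - name: DB_PASSWORD            # injected as a variable
      valueFrom:
        secretKeyRef:
          name: db-creds
          key: password
    volumeMounts:
    - name: creds                  # injected as files under /etc/creds
      mountPath: /etc/creds
      readOnly: true
  volumes:
  - name: creds
    secret:
      secretName: db-creds
```

The application only ever sees a variable and a file path; who can read or change db-creds is then purely an RBAC question.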

Security

This may seem like the obvious one, but obfuscating and abstracting secrets out of the normal operating and development space is also the easiest way to keep them out of consumer space; genius, I know. What about all those repositories we use to collaborate and apply version control? It is a lot easier to keep tokens and passwords from being accidentally pushed to the wrong place if they were never in the code in the first place. In a similar way, it allows companies to collaborate with consultants and outside developers without giving away the secret sauce.

Simplification

Abstraction of any sort can vastly simplify code. If we look at the life cycle of an application as it moves from local development through staging and QA and finally into the production environment, it is important that the code at each level does-not-change. If a human being is manually changing connections, passwords, etc. at each stage, you are asking for a late night spent chasing a simple spelling error, an extraneous tab, or an extra semicolon. Kubernetes namespaces allow the same secret to be defined in each environment. Simple.
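A sketch of that namespace trick (names and values are illustrative): the same secret name is defined once per environment, so any manifest that references it never changes between stages; only the values differ:

```yaml
# Hypothetical: identical Secret name in each namespace, so a
# deployment referencing "db-creds" is byte-for-byte the same
# in staging and production.
apiVersion: v1
kind: Secret
metadata:
  name: db-creds
  namespace: staging
stringData:
  password: staging-only-placeholder
---
apiVersion: v1
kind: Secret
metadata:
  name: db-creds
  namespace: production
stringData:
  password: production-placeholder
```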

Accessibility

In contrast to Security, this is the least obvious benefit. However, anyone who has walked into a new project and found that the only documentation is the code will understand the need for consistency and clarity. This is also true for contracting consultants and developers. Those same people whose hands we need to keep out of the cookie jar also need to have access to build, test, and deploy with appropriate resources.

All these things are possible by implementing cloud-native apps that ingest secrets from the Kubernetes cluster by default, from the beginning. Such perfect conditions do not, however, usually exist without a lot of forethought and planning.

A Note of Caution

Typical Panic Bar, Jonathan Clemens

Secrets management in Kubernetes is not perfect and should still be handled with care. Encoding something that is easily decoded, and leaving it in a public space, is a bad practice. Encoding is not the same as encrypting: encoding simply translates a key, code, or certificate into something that can be decoded on the other side without sharing a decryption key. In Kubernetes, a base64-encoded secret can be decoded by anyone who has access to get those secrets. Use RBAC and remove that access.
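A quick shell session makes the point; base64 is reversible by anyone, with no key required (the “secret” here is obviously a stand-in):

```shell
# Encoding is not encryption: anyone can round-trip it.
secret='hunter2'
encoded=$(printf '%s' "$secret" | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$encoded -> $decoded"
# -> aHVudGVyMg== -> hunter2
```

Anyone allowed to `kubectl get secret` can do exactly this to every value in the object, which is why the real control is RBAC, not the encoding.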

Everyone should be familiar with the idea of the crash bar, panic bar, egress bar, etc. As we will see in a moment, it is much like the latch on old refrigerators: its purpose is to allow egress when needed while returning to a locked state for ingress, without user intervention. Most of these doors need to remain closed so that the connected alarm systems do not set off a “held open” alarm. To avoid this, and to facilitate those times when the door does need to be unlocked, most of them have a way to “dog down” the latch. In most cases, you push down the bar, insert and twist a dogging key, and the door stays unlocked. Ideally, this system keeps the property safe while allowing exceptions. The problem is that the key, most often some sort of hex key, can never be found when people need it, and when they need it, they have to have it. So the dogging key is placed, all too often, on the door jamb right above the door it can dog down. All too often, this results in an unlocked door in a space that is supposed to remain secured.

This is an excellent example of obfuscation without abstraction.

The Key Superglued in the Fridge Door

Nevera Philco, año 1950, Museo del Pueblo de Asturias, 01

Old refrigerators had latches that sealed shut when the door closed. It was simply a solution that worked with the technology available. Even G.I. Joe had a PSA about the dangers of climbing inside and suffocating. Now refrigerators have magnetic strips in the gasket, so it is easy to get out. The interim fix was to remove the latch but add a key lock to the fridge/freezer door, keeping the door shut without enabling someone to lock themselves in. The key was seen as a needed item for access control and protection, but it was not so big a deal that anyone worried about having the Sunday roast stolen. I am told it was not uncommon to glue the key into the lock.

When we started building infrastructure and rolling out usable applications and web pages thirty years ago, we worked with what we had. There were no mainstream key-management systems. Fast-forward twenty years, and some developers are still hard-coding keys and database credentials into version-controlled code. Some will argue that they are the only ones with access and that nothing is exposed. How many times have we all heard the argument that switching would be expensive and incredibly time-intensive? While one of the benefits listed above, accessibility, has been achieved, it has opened an opportunity for a security incident. Moreover, it forces the developer to change keys and codes as the application moves out of development toward staging, QA, and production. Worse, they sometimes use the same key for all their environments and occasionally contaminate databases with the wrong data.

The Compromise

Business constraints exist. While it would be great to go through and fix everything all at once, there are always other projects that will take precedence over cleaning up systems that already exist in production.

I did promise advice.

The Unmovable Developer

My boss, James Hunt, was all too happy to play the part of the stubborn, too-busy-to-breathe, unwilling-to-re-code developer. In all reality, I think he wrote this as a warm-up to the day with his first cup of coffee. His single-page web app, called O-Fortuna, was simple enough: run the pod and visit the webpage to see random sayings. The kicker here was that we wanted to be able to change the contents of the file that contained the different fortunes, and we were forbidden to alter the container itself. The fortunes.js config file looks like this:

module.exports = [
  'There is no cow on the ice',
  'Pretend to be an Englishman',
  'Not my circus, not my monkeys',
  'God gives nuts to the man with no teeth',
  'A camel cannot see its own hump',
]

The Starting Line

For clarity's sake, I have included the original deployment. Feel free to visit the repository and dig into the Dockerfile and docker run command to see where this manifest comes from. For all intents and purposes, we simply created a pod (in a Deployment) to run the o-fortuna image.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: file-injecternitiator
  name: file-injecternitiator
  namespace: fortuna
spec:
  replicas: 1
  selector:
    matchLabels:
      app: file-injecternitiator
  template:
    metadata:
      labels:
        app: file-injecternitiator
    spec:
      containers:
      - name: fortuna
        image: filefrog/o-fortuna:latest
        command: ["/bin/sh", "-c"]
        args: ["node index.js"]
        ports:
        - containerPort: 3000

Works as intended.

Possible Options

At this point, there are several options to get the new information into the pod.

  1. We could make a configMap to replace fortunes.js, and for this example that would be… fine. However, as we are treating these fortunes as secrets, we would have neither obfuscation nor abstraction… fail.
  2. We could turn the new fortunes.js file into a Kubernetes secret object by base64-encoding the entire thing. We would have obfuscation and abstraction, but we would not have made the deployment easier to edit, easier for a contractor to work on, or simpler to push through development stages, nor would we have easy repeatability… fail.
  3. We could create an init container to pull down the fortunes.js file from the git repository and inject replacement variables into the file with sed commands before copying it over to the running container. While this may work for this specific instance, it is not repeatable, and it is clunky.

Putting it All Together

Each of those options has something going for it, and if we combine them all, we get something useful. Reviewing our requirements: we want something that can be applied to a variety of applications to make use of Kubernetes secret objects and third-party secret management. We want minimal impact on the existing project, with predictable results even in complex scenarios. We want to increase security, simplicity, and accessibility, and get our teams closer to best practices.

The configMap

The first step is to know WHAT we want to change and to prep that file. We know the format of the fortunes.js file, and we can prepare a version that has placeholders for future data. That file can be injected with a configMap.

apiVersion: v1
kind: ConfigMap
metadata:
  name: old-file
  namespace: fortuna
data:
  old.file: |
    module.exports = [
      '${FORTUNE_1}',
      '${FORTUNE_2}',
      '${FORTUNE_3}',
      '${FORTUNE_4}',
      '${FORTUNE_5}',
    ]

The placeholders look like environment variables, but neither the Node application nor Kubernetes will treat them that way… yet.

The Secrets

Next come our new fortunes, ideally from a manager or team member who handles the encryption keys, etc. for the environment. I have taken the opportunity to base64-encode these ahead of time, but they could also be inserted as stringData.

apiVersion: v1
kind: Secret
metadata:
  name: key-token-new
  namespace: fortuna
data:
  FORTUNE_1: VGFsa2luZyBhYm91dCB0aGUgd29sZg==
  FORTUNE_2: VG8gc2hvdyB3aGVyZSB0aGUgY3JheWZpc2ggaXMgd2ludGVyaW5n
  FORTUNE_3: QSB3aGl0ZSBjcm93
  FORTUNE_4: VG8gcGxhbnQgYSBwaWcgb24gc29tZW9uZQ==
  FORTUNE_5: 0JrRg9C/0LjRgtC4INC60L7RgtCwINCyINC80ZbRiNC60YM=
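For reference, the encoded values above are simply the plain fortunes run through base64. A sketch of producing one yourself, or of letting kubectl handle the encoding for you (assuming access to the cluster):

```shell
# Produce the value for FORTUNE_1 in the manifest above
printf '%s' 'Talking about the wolf' | base64
# -> VGFsa2luZyBhYm91dCB0aGUgd29sZg==

# Or create the whole Secret imperatively and skip hand-encoding:
# kubectl -n fortuna create secret generic key-token-new \
#   --from-literal=FORTUNE_1='Talking about the wolf' \
#   --from-literal=FORTUNE_2='…'   # and so on for each fortune
```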

The Switcheroo

variable-swap is a fairly self-explanatory image. It uses the envsubst command from the gettext package to replace all strings “annotated like variables” with the environment variables passed into the container. envsubst has the advantage over sed of processing the entire file at once, instead of once per variable, and it does not have to be changed for each deployment. My git repository has a great standalone example of getting the image running on Docker Desktop, or if you want to roll your own:

FROM ubuntu
RUN apt-get update \
    && apt-get -y install gettext-base \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/bin/sh", "-c", "envsubst < $OLD_FILE > $NEW_FILE"]

An important distinction needs to be made here. Neither Kubernetes nor Node.js sees the contents of the new fortunes.js file as having variables. envsubst looks for anything that looks like a variable and, if there is a matching environment variable, performs the substitution.

The Deployment

The final deployment can be seen put together in the file-injecternitiator git repository. The first thing to notice is the minimal change to the original container: we added a new file and copied that file to its final location.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: file-injecternitiator
  name: file-injecternitiator
  namespace: fortuna
spec:
  replicas: 1
  selector:
    matchLabels:
      app: file-injecternitiator
  template:
    metadata:
      labels:
        app: file-injecternitiator
    spec:
      containers:
      - name: fortuna
        image: filefrog/o-fortuna:latest
        command: ["/bin/sh", "-c"]
        args: ["cp /new_file/new.file /fortunes.js && node index.js"]
        ports:
        - containerPort: 3000
        volumeMounts:
        - name: injected-secret-volume
          mountPath: /new_file
      initContainers:
      - name: variable-swap
        image: tomvoboril/variable-swap
        env:
        - name: OLD_FILE
          value: /old.file
        - name: NEW_FILE
          value: /new_file/new.file
        envFrom:
        - secretRef:
            name: key-token-new
        volumeMounts:
        - name: injected-secret-volume
          mountPath: /new_file
        - name: old-file
          mountPath: /old.file
          subPath: old.file
      volumes:
      - name: injected-secret-volume
        emptyDir: {}
      - name: old-file
        configMap:
          name: old-file

Next, we can see the implementation of the variable-swap image as an init container, with our secret object from above injected as environment variables. An initContainer must complete before the other containers can start. It can be difficult to diagnose initContainers if you do not use the --all-containers flag when looking at logs.

We did not use an initContainer that had kubectl installed and a service account to inject the secrets directly into the running container from the initContainer. That would have increased complexity without increasing security. Arguably, it would have decreased security by possibly exposing the variables to the node host through process snooping.

We also used a shared emptyDir: {} instead of a hostPath or persistent volume. This is, again, an attempt to simplify the cleanup of data and not expose secrets any more than necessary.

      initContainers:
      - name: variable-swap
        image: tomvoboril/variable-swap
        env:
        - name: OLD_FILE
          value: /old.file
        - name: NEW_FILE
          value: /new_file/new.file
        envFrom:
        - secretRef:
            name: key-token-new
        volumeMounts:
        - name: injected-secret-volume
          mountPath: /new_file
        - name: old-file
          mountPath: /old.file
          subPath: old.file

Finally, we can see where our “old” file is injected with a configMap in the volumes section of the YAML file.

      volumes:
      - name: injected-secret-volume
        emptyDir: {}
      - name: old-file
        configMap:
          name: old-file

New fortunes!

Everything Running Beautifully.

Thoughts and Considerations

✓ More secure?
✓ Simple to modify and deploy changes?
✓ More accessible to other developers?

Photo by Mohammad Rahmani on Unsplash

This is also not the place to STOP improving. Sometimes making a big step is not feasible because of time, budget, infrastructure, personalities, or internal politics, but those should not be reasons to avoid taking small steps. We would love to see all our clients have perfect environments free from hurdles, foibles, and holes, but sometimes it is more important to start building better habits than to make a single big change that doesn't stick and doesn't permeate the company's culture and processes.