Showing posts with label e-mail. Show all posts

2012-10-24

gmail_imap.py

This is the source code of gmail_imap.py, a script for quickly importing your GMail mail to Outlook Exchange with all your labels.  Open an editor such as Notepad, copy and paste the code below, then save it as gmail_imap.py.


# Copyright 2012 Lajos Molnar.
# Licensed under the Creative Commons Attribution-ShareAlike 3.0 license
# See http://creativecommons.org/licenses/by-sa/3.0/

import imaplib, sys, hashlib, base64, pickle, os, getpass, argparse

p = argparse.ArgumentParser('gmail_imap.py', description='helper script to migrate GMail labels to Outlook Categories')
p.add_argument('operation', choices=('import', 'purge', 'apply'))
p.add_argument('-f', '--file', help='db file to import', default='gmail_imported.pickle')
p.add_argument('-r', '--reimport', help='reimport already imported labels', action='store_true')
p.add_argument('-e', '--email', help='gmail account name')
p.add_argument('-p', '--password', '--pwd', help='gmail account password')
p.add_argument('-I', '--inbox', help='label used instead of Inbox', default='Inbox')
p.add_argument('-T', '--trash', help='label used instead of Trash', default='Trash')
p.add_argument('--import-trash', help='also import/purge trash', action='store_true')
p.add_argument('--final', help='purge imported messages', action='store_true')
p.add_argument('--folder', help='Outlook folder where mail was imported to')
p.add_argument('--limit', type=int, help='Maximum number of items to purge at one time')
p.add_argument('--recurse', help='apply labels recursively for messages in all subfolders', action='store_true')
a = p.parse_args()

all_mail, trash = '[Gmail]/All Mail', '[Gmail]/Trash'
# read any existing state
LABELS, labels_done = {}, set()
try:
    with open(a.file, 'rb') as f:
        LABELS = pickle.load(f)
        # each value is a list of labels; flatten them (a set of lists would raise TypeError)
        labels_done = set(l for k, v in LABELS.items() if k != 'MAIL' for l in v)
except:
    pass

def hashOf(msg):
    m = hashlib.sha1()
    m.update(msg.encode('utf-16')[2:])
    return base64.b64encode(m.digest()).decode('ascii')

if a.operation == 'apply':
    from win32com.client import Dispatch
    print('connecting to Outlook...')
    O = Dispatch('Outlook.Application')
    print('browsing to folder', a.folder, O)
    F = MAPI = O.GetNamespace('MAPI')
    for f in a.folder.split('\\'):
        fs = [f.Name for f in F.Folders]
        F = F.Folders(f)
    print('Cataloging mail...', F.StoreID)
    MAIL = {}
    if a.recurse:
        special = ()
        def walk(F):
            yield F
            for f in F.Folders:
                for f_ in walk(f):
                    yield f_
        folders = list(walk(F))
    else:
        special = ('Inbox', 'Trash')
        folders = [f.Name for f in F.Folders]
        for f in special:
            if f not in folders:
                F.Folders.Add(f)
        folders = [F] + [f for f in F.Folders if f.Name in special]

    for f in folders:
        def cats(i):
            c = set()
            if i.Parent.Name in special:
                c.add(i.Parent.Name)
            if i.Categories:
                c |= set(i.Categories.split(', '))
            return c

        N = f.Items.Count
        print('Applying labels in', f.Name, 'for', N, 'items ...')
        for ix, i in enumerate(f.Items, 1):
            PR_TRANSPORT_MESSAGE_HEADER = "http://schemas.microsoft.com/mapi/proptag/0x007D001E"
            msg = i.PropertyAccessor.GetProperty(PR_TRANSPORT_MESSAGE_HEADER)
            h = hashOf(msg)
            c = set()
            if h in MAIL:
                if (i.EntryID, f.StoreID) != MAIL[h]:
                    # already imported, combine labels
                    old = MAPI.GetItemFromID(*MAIL[h])
                    c = cats(old)
                    if old.UnRead:
                        i.UnRead = True
                    print('already imported', h, 'with', c)
                    print('adding to categories', cats(i))
                    old.Delete()
            elif h in LABELS:
                c = set(c.replace('[Gmail]/', '') for c in LABELS[h] if c != all_mail)
            c -= cats(i)
            if c:
                for cat in c:
                    # apply categories
                    if cat == 'Important':
                        i.Importance = 2
                    elif cat in special and i.Parent.Name != cat:
                        i = i.Move(F.Folders(cat))
                    elif not i.Categories:
                        i.Categories = cat
                    else:
                        i.Categories = i.Categories + ', ' + cat
                i.Save()

            try:
                del LABELS[h]
            except:
                pass
            MAIL[h] = (i.EntryID, f.StoreID)
            if ix % 100 == 0:
                print("{:%}".format(ix / N), end='\r', file=sys.stderr)
                sys.stderr.flush()

    # also save all imported hash
    if LABELS:
        print('WARNING: Could not apply labels for', len(LABELS), 'messages. ')
        print('Please reapply the created', a.file + '.remaining', 'database file later.')
    LABELS['MAIL'] = list(MAIL.keys())

    with open(a.file + '.remaining', 'wb') as f:
        pickle.dump(LABELS, f)
else:
    if a.operation == 'purge' and a.final and 'MAIL' not in LABELS:
        print('For final merge, please specify .remaining file from apply')
        sys.exit(1)

    # connect to IMAP
    M = imaplib.IMAP4_SSL("imap.gmail.com")
    print('logging in...')
    M.login(a.email or getpass.getpass('Email:'), a.password or getpass.getpass())
    
    print('querying labels...')
    typ, data = M.list()
    assert typ == 'OK', typ
    labels = [ eval(d.partition(b') "')[2].partition(b' ')[2], None, None) for d in data ]
    print('found', len(labels), 'labels')
    total_found = 0
    # since all_mail label is critical, verify that it is correct
    assert all_mail in labels or any(l.lower() == all_mail.lower() for l in labels if l != all_mail), "{} not in IMAP folder list".format(all_mail)

    def group(nums, limit=None):
        if limit == None:
            limit = len(nums)
        while nums:
            num_s = num_e = min(nums)
            while num_e + 1 in nums and num_e + 1 < num_s + limit:
                num_e += 1
            yield str(num_s) if num_s == num_e else "%d:%d" % (num_s, num_e)
            nums -= set(range(num_s, num_e + 1))

    def process(M, l, a, nums, LABELS, limit=None, trash=False):
        if l == a.inbox:
            l = 'Inbox'
        elif l == a.trash:
            # the 'trash' parameter is a boolean here, so use the label name directly
            l = 'Trash'

        if limit == None:
            limit = len(nums)
        for num in group(nums, 50):
            print(num, end='\r', file=sys.stderr)
            sys.stderr.flush()

            deleted = set()

            typ, data = M.fetch(num, '(BODY.PEEK[HEADER])')
            assert typ == 'OK', typ
            for d in data:
                if type(d) == type((1,2)):
                    h = hashOf(d[1].decode('ascii'))
                    if a.operation == 'import':
                        try:
                            if l not in LABELS[h]:
                                LABELS[h].append(l)
                        except:
                            LABELS[h] = [l]
                    elif a.operation == 'purge':
                        num = int(d[0].partition(b' ')[0])
                        try:
                            if l in LABELS[h]:
                                deleted.add(num)
                                if l == all_mail:
                                    try:
                                        LABELS['TRASH'].append(h)
                                    except:
                                        LABELS['TRASH'] = [h]                                    
                        except:
                            pass

            if a.operation == 'purge':
                limit -= len(deleted)
                for num in group(deleted):
                    print("deleting", num)
                    if trash:
                        M.store(num, '+X-GM-LABELS', '\\Trash')
                    else:
                        M.store(num, '+FLAGS.SILENT', '\\Deleted')
                if limit < 0:
                    break

        if a.operation == 'import':
            print('done', 'total', len(LABELS), 'mail')
            with open(a.file + '.new', 'wb') as f:
                pickle.dump(LABELS, f)
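            # os.rename cannot overwrite on Windows; the file may not exist on the first import
            if sys.platform == 'win32' and os.path.exists(a.file):
                os.unlink(a.file)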
            os.rename(a.file + '.new', a.file)
        elif a.operation == 'purge':
            print('expunging...')
            M.expunge()

        M.close()

    def get_nums(M, l):
        print('cataloguing label', l, end='... ')
        sys.stdout.flush()
        typ, data = M.select('"' + l + '"')
        if typ == 'NO':
            return set()

        typ, data = M.search('', 'ALL')
        assert typ == 'OK', typ
        nums = set(map(int, data[0].split()))
        print('has', len(nums), 'messages')
        return nums

    for l in labels:
        # don't reimport existing labels
        if a.operation == 'import' and l in labels_done and not a.reimport:
            continue
        elif a.operation == 'purge' and l == all_mail:
            continue

        nums = get_nums(M, l)
        if nums and (l != trash or (a.import_trash and not a.final)):
            total_found += len(nums)
            process(M, l, a, nums, LABELS)
    if a.operation =='purge' and a.final and total_found == 0:
        print('purging imported mail items (total', len(LABELS['MAIL']), ')')
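        # build the lookup once so it exists even if All Mail is empty,
        # and so 'TRASH' entries accumulate across rounds
        LABELS2 = dict((i, [all_mail]) for i in LABELS['MAIL'])
        while True:
            nums = get_nums(M, all_mail)
            if not nums:
                break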
            process(M, all_mail, a, nums, LABELS2, limit=a.limit, trash=True)
            if not a.limit:
                break

        # remove deleted messages from Trash
        if 'TRASH' in LABELS2:
            while True:
                nums = get_nums(M, trash)
                if not nums:
                    break
                process(M, trash, a, nums, dict((i, [trash]) for i in LABELS2['TRASH']), limit=a.limit)
                if not a.limit:
                    break

    M.logout()
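
The group helper in the script compresses message numbers into IMAP sequence sets, so that each FETCH or STORE covers a run of consecutive messages.  Here is a standalone sketch of the same idea; the name to_sequence_sets is made up for this illustration, and unlike the original it does not consume the set passed in:

```python
def to_sequence_sets(nums, limit=None):
    """Yield IMAP sequence sets ("7" or "3:5") covering nums in ascending order.

    limit caps how many messages a single range may cover.
    """
    remaining = sorted(set(nums))
    if limit is None:
        limit = len(remaining)
    start = 0
    while start < len(remaining):
        end = start
        # extend the run while numbers are consecutive and the cap is not reached
        while (end + 1 < len(remaining)
               and remaining[end + 1] == remaining[end] + 1
               and end + 1 - start < limit):
            end += 1
        first, last = remaining[start], remaining[end]
        yield str(first) if first == last else "%d:%d" % (first, last)
        start = end + 1


print(list(to_sequence_sets({1, 2, 3, 5, 7, 8})))
```

Capping the range length (the script uses 50 for fetches) keeps each server round trip small, at the cost of more commands.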
  

2012-10-07

Migrating GMail to Exchange (part 4) - Doing the migration

Now that you have set up your e-mail server and client, and added the conversion macro, you are ready to start the migration.  The conversion process consists of the following steps:
  1. Download messages from GMail/IMAP server
  2. Copy messages and/or labels to the target folder.  This (locally) deletes converted emails from the IMAP server.
  3. Purge deleted messages from the IMAP server
If you have limited the number of messages in the IMAP folders, you will need to repeat these steps first until all labeled messages have been converted, then again for all unlabeled messages.

If you have not limited the number of messages in the IMAP folders, it still makes sense to do these steps for all labeled messages first (including the purge) to ensure that all labeled messages have indeed been migrated.  Migrating unlabeled messages will remove any remaining labels on those messages.  (Actually, it will move them to Trash.  You can restore your messages from Trash to recover any remaining labeled versions of the messages.)  Once you have verified that all labeled messages have been migrated, migrate the unlabeled messages.

Let's look at each step in detail:

1. Download IMAP messages

To download messages from IMAP, you first need to subscribe to the folders that you want to download.  Do this by right-clicking the GMail folder and selecting Update Folder List.  Once complete, right-click again and select IMAP Folders...


If you are migrating labeled messages, subscribe to all folders except [Gmail]/All Mail.  This will help keep your Outlook cache file smaller and the migration faster.  At the end, subscribe to [Gmail]/All Mail as well.

To subscribe to the folders, first select Query. This will list all IMAP folders.


Then select the desired folders you want to subscribe to, and click Subscribe.  Subscribed folders will have a folder mark next to them.  Once satisfied, click OK.


Click Send / Receive => Send/Receive All Folders (or simply press F9) to download IMAP messages in the subscribed folders.  This can take a while.

2. Run the conversion script

There are two ways to run the macro.
  1. From the Outlook window, you can press Alt+F8 to bring up the Macros menu.
  2. Alternatively, you can run the macro from the Microsoft Visual Basic for Applications window (that you started earlier with Alt+F11).  Here you can follow the progress of the conversion in the Immediate window; to show it, press Ctrl+G.  You can bring up the Macros menu by placing the cursor on the top line and pressing F5.  If the cursor is not at the top of the script, there is a danger that it is inside a runnable macro, which will then start without confirmation.


If there are still messages with labels, select import_all_labeled_messages and Run.  Once you have migrated all labeled messages from GMail (not just the ones downloaded, but positively all labeled messages), you can run the second macro import_finally_all_mail.

You should not get any error messages, unless your mailbox gets full or the IMAP connection is lost.  If this happens, you can restart the conversion script and the conversion will continue.  If the mailbox size limit has been reached, you will need to increase the PST size limit before continuing.  This setting will not take effect until Outlook is restarted.

3. Purge deleted messages

Once the migrated messages have been deleted from the respective "label" folders, you also need to communicate this to the GMail server.  You only need to do this if you previously elected to optimize the migration by doing cached deletion (i.e. you set auto-expunge OFF and chose to mark items for deletion).

To purge the deleted messages, select a folder on the IMAP server, click Folder => Purge => Purge All Messages for account.



This will take a while, but will enable you to download further messages and do the next round of migration.

NOTE: you will not be able to purge your messages in the "[Gmail]/All Mail" folder due to GMail's handling of deleting a message.  Instead, you will have to move all of the deleted messages (these are shown crossed out and greyed out) to the "[Gmail]/Trash" folder.  This will remove them from the All Mail folder. You can then empty the trash in GMail.
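
If you would rather script this step, gmail_imap.py does the equivalent over IMAP by adding GMail's \Trash label through the X-GM-LABELS extension.  A minimal sketch, assuming conn is an already logged-in imaplib.IMAP4_SSL connection with "[Gmail]/All Mail" selected; the helper name move_to_trash is hypothetical:

```python
def move_to_trash(conn, sequence_sets):
    """Add Gmail's \\Trash label to each sequence set; return the sets that failed."""
    failed = []
    for seq in sequence_sets:
        # same STORE command that gmail_imap.py issues when purging All Mail
        typ, _ = conn.store(seq, '+X-GM-LABELS', '\\Trash')
        if typ != 'OK':
            failed.append(seq)
    return failed


# Usage (assumption: IMAP is enabled and the credentials are valid):
#   import imaplib
#   conn = imaplib.IMAP4_SSL("imap.gmail.com")
#   conn.login(user, password)
#   conn.select('"[Gmail]/All Mail"')
#   move_to_trash(conn, ["1:50"])
```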

Verification

Once you have migrated all labeled messages, verify on the GMail server that there are in fact no conversations left under any label (you can check this under Settings => Labels), nor in your Inbox, Starred messages, Important, Sent Mail, Drafts, Spam, and Trash.  This means that all labeled messages have been successfully migrated.

To complete the migration, you will also need to migrate any remaining mail without any labels.  You need to make sure you are subscribed to the [Gmail]/All Mail folder, and repeat the conversion step, this time running the import_finally_all_mail macro.  This will move all mail into the Trash on the IMAP server.  After this migration step, you should only have your Chat history in your All Mail folder.
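
The same check can be sketched over IMAP: select each folder read-only and look at the message count that SELECT reports.  This is only an illustration; nonempty_folders is a hypothetical helper, conn is assumed to be a logged-in imaplib connection, and the folder names you pass in should come from your own folder list:

```python
def nonempty_folders(conn, folders):
    """Return {folder: message_count} for folders that still contain mail."""
    leftover = {}
    for name in folders:
        # readonly=True so that checking never changes flags on the server
        typ, data = conn.select('"' + name + '"', readonly=True)
        if typ != 'OK':
            continue  # folder cannot be selected, e.g. a pure parent such as [Gmail]
        count = int(data[0])
        if count:
            leftover[name] = count
    return leftover
```

An empty result for all label folders (everything except [Gmail]/All Mail) corresponds to the state described above.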

So what can you do with the remaining data on your GMail account?  Find out next...

Migrating GMail to Exchange (part 3) - Conversion VBA Macro

Once you have configured your Outlook client, the next step is to add the conversion macro to Outlook.

1. Enter Macro editor

Press Alt+F11 to start the Macro editor in Outlook.  A security message may appear to indicate that you are enabling macros.



Select Enable Macros to proceed.

2. Open Outlook.VBA script:

In the Project tool-window, click on Project1 (VbaProject.OTM) => Microsoft Outlook Objects to access ThisOutlookSession.  Double click on ThisOutlookSession to open the script's edit window.


3. Paste macro

Copy and paste the following script into your global Outlook.VBA script (the edit window):

' Copyright 2012 Lajos Molnar except ToBase64String method, which is marked below.
' Licensed under the Creative Commons Attribution-ShareAlike 3.0 license
' See http://creativecommons.org/licenses/by-sa/3.0/

Option Explicit

Const IMAP = "you@domain.com"       ' name of outlook root folder where GMail account is read via IMAP
Const PST = "migrated"              ' name of outlook root folder where mail should be imported to
Const cache_folder = ""    ' name of subfolder inside PST where GMail folders are copied to (or empty)
Const trash_label = "T"    ' alternate name of Trash folder (or empty)

Const date_range = ""               ' date range to import (or empty)

#Const USE_BODY = 0                 ' set to 1 to use whole message body

' =============== HASHING ===============
Function ToBase64String(rabyt)
  'Ref: http://stackoverflow.com/questions/1118947/converting-binary-file-to-base64-string
  With CreateObject("MSXML2.DOMDocument")
    .LoadXML "<root />"
    .DocumentElement.DataType = "bin.base64"
    .DocumentElement.nodeTypedValue = rabyt
    ToBase64String = Replace(.DocumentElement.text, vbLf, "")
  End With
End Function

Function getHash(ByRef sha1, ByRef strToHash As String) As String
    Dim inBytes() As Byte, shaBytes() As Byte, b, b2 As Byte
    Dim r As String
    inBytes() = strToHash
    shaBytes() = sha1.ComputeHash_2(inBytes)
    getHash = ToBase64String(shaBytes)
End Function

Function hashOf(ByRef sha1, i) As String
    Const PR_TRANSPORT_MESSAGE_HEADERS = "http://schemas.microsoft.com/mapi/proptag/0x007D001E"
    Dim olkPA As Outlook.PropertyAccessor
    Set olkPA = i.PropertyAccessor
    Dim body As String
    body = olkPA.GetProperty(PR_TRANSPORT_MESSAGE_HEADERS)
#If USE_BODY Then
    body = body + i.body
    If TypeName(i) = "MailItem" Then body = body + i.HTMLBody
#End If
    hashOf = getHash(sha1, body)
    Set olkPA = Nothing
End Function

Public Sub import_finally_all_mail()
    do_import True
End Sub

Public Sub import_all_labeled_messages()
    do_import False
End Sub

Private Sub do_import(final_import As Boolean)
    Dim dDone As New Scripting.Dictionary
    Dim dIMAP As New Scripting.Dictionary

    Dim sha1 As SHA1CryptoServiceProvider
    Debug.Print "Creating SHA1 service provider..."
    Set sha1 = New SHA1CryptoServiceProvider

    Dim MAPI As Outlook.NameSpace
    Set MAPI = ThisOutlookSession.GetNamespace("MAPI")

    Dim imap_folder As Outlook.MAPIFolder, target As Outlook.MAPIFolder, f As Variant
    Set imap_folder = MAPI.Folders(IMAP)
    Set target = MAPI.Folders(PST)

    Dim i As Variant, i2 As Variant
    Dim items As Outlook.items

    ' catalog imported messages
    Debug.Print "Cataloguing already imported messages... ";

    On Error Resume Next
    target.Folders.Add "Inbox"
    target.Folders.Add "Trash"
    On Error GoTo 0

    Dim target_list, c, cat As String
    target_list = Array(target, target.Folders("Inbox"), target.Folders("Trash"))

    For Each f In target_list
        Debug.Print "("; f.Name; ": "; f.items.Count; "messages) ";
        For Each i In f.items
            Dim h As String
            h = hashOf(sha1, i)

            If dDone.Exists(h) Then
                ' dDone maps hash -> EntryID, so look up the earlier copy first
                Set i2 = MAPI.GetItemFromID(dDone(h), target.StoreID)
                Debug.Print "***DUPLICATE***"; i.Parent.Name; ":"; i.Subject; "and"; i2.Parent.Name; ":"; i2.Subject
                ' Remove one of the duplicates - combine categories
                ' (categories are split on ", " to match add_category)
                If i.Parent.Name = PST Then
                    For Each c In Split(i.Categories, ", ")
                        cat = c
                        Set i2 = add_category(i2, cat, target)
                    Next c
                    i2.Save
                    dDone(h) = i2.EntryID
                    i.Delete
                Else
                    For Each c In Split(i2.Categories, ", ")
                        cat = c
                        Set i = add_category(i, cat, target)
                    Next c
                    i.Save
                    i2.Delete
                    dDone(h) = i.EntryID
                End If
            Else
                Debug.Assert i = MAPI.GetItemFromID(i.EntryID, target.StoreID)
                dDone.Add h, i.EntryID
            End If
            If dDone.Count Mod 1000 = 0 Then Debug.Print dDone.Count; " ";
        Next
    Next
    Debug.Print "done"

    If cache_folder <> "" Then
        import_labels target.Folders(cache_folder), "", dDone, dIMAP, sha1, target, False
    End If

    If final_import Then
        ' import_labels skips its root folder when the label is empty, so import All Mail directly
        import_label dDone, dIMAP, imap_folder.Folders("[Gmail]").Folders("All Mail"), "", sha1, target, True
    Else
        import_labels imap_folder, "", dDone, dIMAP, sha1, target, True
    End If
End Sub

Private Function add_category(item, ByVal category As String, target As Outlook.MAPIFolder)
    Set add_category = item
    If category = "" Then Exit Function

    If Left(category, 8) = "[Gmail]/" Then category = Mid(category, 9)
    If category = trash_label Then category = "Trash"

    ' Handle Important separately
    If category = "Important" Then
        item.Importance = olImportanceHigh
        'item.Save
    ElseIf category = "Inbox" Or category = "Trash" Then
        If item.Parent.Name <> category Then
            Set add_category = item.Move(target.Folders(category))
        End If
    ElseIf item.Categories = "" Then
        item.Categories = category
        'item.Save
    ElseIf InStr(", " + item.Categories + ", ", ", " + category + ", ") = 0 Then
        item.Categories = item.Categories + ", " + category
        'item.Save
    End If
End Function

Private Function import_mailitem(i, label As String, _
        dDone As Scripting.Dictionary, dIMAP As Scripting.Dictionary, _
        sha1, target As Outlook.MAPIFolder) As Boolean
    Dim MAPI As Outlook.NameSpace
    Set MAPI = ThisOutlookSession.GetNamespace("MAPI")

    If TypeName(i) = "MailItem" Or TypeName(i) = "AppointmentItem" Or TypeName(i) = "MeetingItem" Then
        Dim i2, d As String

        If dIMAP.Exists(i.EntryID) Then
            d = dIMAP(i.EntryID)
        Else
            d = hashOf(sha1, i)
        End If

        If dDone.Exists(d) Then
            Set i2 = MAPI.GetItemFromID(dDone(d), target.StoreID)
            Debug.Assert i2.Subject = i.Subject
            Debug.Print "["; d; "] "; i2.Subject; " is already imported with categories "; i2.Categories
            Set i2 = add_category(i2, label, target)
            dDone(d) = i2.EntryID
            i2.UnRead = i.UnRead
            i2.Save
            i.Delete
        Else
            Dim UnRead As Boolean
            UnRead = i.UnRead
            Set i2 = i.Move(target)
            Debug.Print "moving ["; d; "] "; i2.Subject; Format(dDone.Count, " (0)")
            Set i2 = add_category(i2, label, target)
            dDone.Add d, i2.EntryID
            i2.UnRead = UnRead
            i2.Save
        End If

        import_mailitem = True
    Else
        Debug.Print "ignoring "; TypeName(i); i.Subject
    End If
End Function

Private Sub import_label(dDone As Scripting.Dictionary, dIMAP As Scripting.Dictionary, _
        folder As Outlook.MAPIFolder, label As String, sha1, _
        target As Outlook.MAPIFolder, remote As Boolean)
    ' N is a Long: an Integer would overflow after 32767 messages
    Dim i As Variant, items As Outlook.items, N As Long
    Debug.Print "Importing mail items with label "; label;

    If date_range <> "" Or remote Then
        Dim condition As String
        If remote Then
            Debug.Print " not marked... ";
            condition = "[IMAP Status] = 'Unmarked'"
        End If
        If date_range <> "" Then
            Debug.Print " between "; Replace(date_range, "-", " and "); "... ";
            If condition <> "" Then condition = condition + " And "
            Dim dr
            dr = Split(date_range, "-")
            condition = condition + "[SentOn] >= '" & dr(0) & "' And [SentOn] < '" & dr(1) & "'"
        End If

        Set items = folder.items
        Debug.Print "got items... ";
        Set i = items.Find(condition)
        Debug.Print "searching ...";

        While Not i Is Nothing
            If import_mailitem(i, label, dDone, dIMAP, sha1, target) Then N = N + 1
            Set i = items.FindNext
            Debug.Print label; Format(N, " \#0 ");
        Wend
    Else
        Debug.Print "... "; folder.Name; folder.Parent.Name
        Set items = folder.items
        Debug.Print "got items... ";

        ' local messages get deleted immediately, so we cannot simply loop
        Dim ix
        ix = 1
        While ix <= items.Count
            If import_mailitem(folder.items(ix), label, dDone, dIMAP, sha1, target) Then
                N = N + 1
                If N Mod 100 = 0 Then Debug.Print N;
            Else
                ix = ix + 1
            End If
        Wend
    End If

    Debug.Print "done"

End Sub

Private Sub import_labels(root As Outlook.MAPIFolder, label As String, _
        dDone As Scripting.Dictionary, dIMAP As Scripting.Dictionary, _
        sha1, target As Outlook.MAPIFolder, remote As Boolean)
    Dim f As Variant, folder As Outlook.MAPIFolder
    For Each f In root.Folders
        Set folder = f
        import_labels folder, label + "/" + f.Name, dDone, dIMAP, sha1, target, remote
    Next

    ' We will import All Mail last as it will finally remove the mail.
    ' Also don't import Trash (as it would permanently delete e-mail)
    If label = "" Or label = "/[Gmail]" Or label = "/[Gmail]/All Mail" _
            Then Exit Sub

    import_label dDone, dIMAP, root, Mid(label, 2), sha1, target, remote
End Sub

4. Configure your script

There are a few configuration variables at the top of your script:

IMAP

This is the name of the Outlook root folder where your emails from GMail show up.

PST

This is the name of your target Outlook root folder where you want to migrate your emails to.

trash_label

This is the name of your trash label.  If you are importing your Trash with all label information, this is the name of the label you created and moved the contents of your Trash under.  If you are importing your Trash as is, you can enter "Trash" or simply "".

date_range

You have the option to only migrate a portion of your mail, e.g. if you want to only import e-mail from 2011, you would set this to "1/1/2011-1/1/2012".  NOTE: your date format is locale specific, so you need to use your date order.  If you want to migrate all mail, set it to "".  If you are selecting a date range, you should NOT set an IMAP folder size limit.  Otherwise, Outlook may not see all the e-mails from the selected date range.
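
To make the effect of date_range concrete, here is a Python sketch of how the macro's import_label routine assembles its Outlook search filter from the setting.  The function name build_filter is made up for this illustration; the filter text follows the VBA code:

```python
def build_filter(date_range="", remote=False):
    """Build the Outlook Items.Find filter the way import_label assembles it."""
    parts = []
    if remote:
        # only look at IMAP items that are not yet marked for deletion
        parts.append("[IMAP Status] = 'Unmarked'")
    if date_range:
        # the range is split on "-", so the dates themselves must not contain "-"
        start, end = date_range.split("-")
        parts.append("[SentOn] >= '%s' And [SentOn] < '%s'" % (start, end))
    return " And ".join(parts)


print(build_filter("1/1/2011-1/1/2012", remote=True))
```

Note that the start date is inclusive and the end date is exclusive, which is why the example for all of 2011 ends on 1/1/2012.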

cache_folder

You might have already copied over your IMAP folders to local folders.  This option allows you to specify the location of the root of the copied folders.  For this script, these have to be a subfolder of your target PST.

NOTE: local migration has not been fully tested.

NOTE 2: do not use this if you have migrated e-mail with categories and now you want to combine these categories.  Instead, simply copy the categorized email into the PST target folder.  Duplicate emails will have their categories merged.  If you do this, pay attention that the Inbox and Trash labels are represented by separate folders, instead of categories.  You need to copy emails with these categories into the respective target sub-folders.
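
As an illustration of what merging categories means here, the sketch below combines two category strings without duplicates.  It assumes the ", " separator that add_category uses; Outlook's actual separator follows the Windows list separator setting, so treat the default as an assumption.  The name merge_categories is hypothetical:

```python
def merge_categories(existing, new, sep=", "):
    """Append categories from `new` to `existing`, skipping ones already present."""
    merged = [c for c in existing.split(sep) if c]
    for cat in new.split(sep):
        if cat and cat not in merged:
            merged.append(cat)
    return sep.join(merged)


print(merge_categories("Work, Travel", "Travel, Family"))
```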

USE_BODY

This script uses the message header of an item to identify it.  I have found this to be unique and it persists across the migration.  If you want to also use the message body, set USE_BODY to 1.   This will slightly slow down the hashing, but it is worth it if you might have messages without message headers.
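
For reference, the header hash can be reproduced outside Outlook.  The VBA getHash assigns the string to a Byte array, which yields its UTF-16 little-endian bytes, and gmail_imap.py matches that by encoding to UTF-16 and dropping the two-byte byte order mark.  A minimal Python sketch of the same computation (header_hash is a name chosen for this example):

```python
import base64
import hashlib


def header_hash(headers):
    """SHA-1 over the UTF-16LE bytes of the header text, base64-encoded."""
    digest = hashlib.sha1(headers.encode("utf-16-le")).digest()
    return base64.b64encode(digest).decode("ascii")


print(header_hash("Subject: hello\r\n"))
```

Because the hash covers the raw header text, any difference in whitespace or line endings between the two sides produces a different key, and the message is then treated as not yet imported.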

5. Save your script

Click on the save icon to save your script.

6. Add the referenced libraries

The script uses two libraries that you have to add to the References.  Select Tools => References.  Look for, and checkmark:
  • mscorlib.dll
  • Microsoft Scripting Runtime
Now you are ready to start the migration...


Migrating GMail to Exchange (part 2) - Configure Outlook

Now that you have set up your GMail account, you need to set up Outlook for the migration.

1. Enable Macros

By default, Outlook will not allow you to run any macros, or at least any unsigned ones.  To change this, go to File => Options => Trust Center => Trust Center Settings => Macro Settings.
 



You will need to set this at least to "Notifications for all macros".  I do not recommend setting "Enable all macros", as this will allow any macros to run.

You will need to restart Outlook to make this change effective.  For now, just quit Outlook.

2. Enable large Outlook PST and OST

Outlook keeps your local mail in a data file with a .pst extension.  By default, Outlook 2010 allows this file to grow to 50GB; however, it is likely that your IT department has severely limited this size.  It's better to set this based on your migration needs.

Similarly, Outlook caches mail that is kept on a server (such as Exchange or IMAP) in a cache file with .ost extension.

You will need to allow PST files to be at least the size of your GMail account (e.g. whatever GMail tells you at the bottom of the pages on the "Using x.y GB of your Z GB" line.)  I also recommend adding some slack, just in case.

You will need to allow OST files to be quite large if you decide to migrate all your mail in one shot without limiting your IMAP folder sizes.  This could be 18GB if each mail has an average of 2 labels (including Inbox, Starred and Important as potential labels).  However, conversion is much faster if the cache file is under 4GB, and even faster under 2GB.  I do not recommend lowering the maximum size for OST files, just in case the IMAP folders grow larger than the limit you set.  This would result in lots of error messages when trying to update your cache.

To change the size limit of the PST and OST files, follow this KB article.  Basically, you will need to change entries using regedit/regedt32 (Start button => type "regedt32" => Press Enter => Say Yes to User Account Control message).


Browse to key: HKEY_CURRENT_USER\Software\Policies\Microsoft\Office\14.0\Outlook\PST or possibly HKEY_CURRENT_USER\Software\Microsoft\Office\14.0\Outlook\PST if it exists.
You have to set MaxLargeFileSize to the higher of your PST or OST estimated size limit. The values are understood in MB. You should set WarnLargeFileSize to be 5% below the Max value.
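
As a worked example of the arithmetic, the sketch below turns a mailbox size in GB into the two registry values in MB.  The 20% default slack is an assumption picked for illustration, and pst_limits_mb is a hypothetical helper; only the MB unit and the 5% gap come from the text above:

```python
def pst_limits_mb(mailbox_gb, slack=0.2):
    """Return (MaxLargeFileSize, WarnLargeFileSize) in MB for a mailbox size in GB."""
    # add the slack on top of the mailbox size, then convert GB to MB
    max_mb = int(round(mailbox_gb * 1024 * (1 + slack)))
    # the warning threshold sits 5% below the maximum (integer arithmetic, rounded down)
    warn_mb = max_mb * 95 // 100
    return max_mb, warn_mb


print(pst_limits_mb(10))
```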


Make sure you remember your old settings in case you need to reset them later, e.g. for your IT to be happy.

3. Create migration target

I recommend migrating your old mail into a new Outlook data file.  This allows you to keep your mail together in a locally managed file, and allows you to continue using Outlook on your Exchange server on the web, or on a different PC.

To create a new Outlook data file, start Outlook, and select Home => New Items => More Items => Outlook Data File...


Use "Outlook Data File" for Save as type.



See your new file on the left side of your Outlook window, as one of the root folders.  Rename it to something unique that you will like.

4. Add your GMail account to Outlook

File => + Add Account

Follow the configuration steps in GMail to add access to your GMail account in Outlook.  These were at the bottom of your POP/IMAP Settings page.

See your IMAP account on the left side of your Outlook window, as one of the root folders.  Rename it to something unique and easy to remember, as most likely the default name has been already taken by your Exchange account.

5. Configure Outlook

For your migration to go smoothly, you will need to disable most of the actions Outlook performs automatically on your e-mails; otherwise, these will interfere with the migration process.

A. Mark items for deletion

[OPTIMIZATION] If you selected the Auto-expunge OFF optimization in GMail, you will need to also set up Outlook to mark items for deletion so that they can be purged in one step.  Go to File => Account Settings => Account Settings... . In the Account Settings dialog, select your IMAP account and click on Change....  Then click on More Settings... => Deleted Items (tab).





  • Select "Mark items for deletion but do not move them automatically".
  • Unselect "Purge items when switching folders while online."

B. Remove tracking automation

These have a tendency to automatically move/remove messages while being migrated, resulting in error messages such as "Item has been deleted" or "Item is missing".

Under File => Options => Mail, under Tracking header




  • Select "Never send a read receipt"
  • Deselect "Automatically process meeting requests and responses to meeting requests and polls"
  • Deselect "Automatically update original sent item with receipt information"
  • Deselect "Update tracking information, and then delete responses that don't contain comments"
  • Deselect "After updating tracking information, move receipt to:"

C. Disable Junk Mail filter

Click on a folder on your GMail account, then click Home => Junk => Junk E-mail Options... .
Select "No Automatic Filtering".



NOTE: these settings are per account, so you need to select a folder on your GMail account to configure the settings.

If you want, you can also disable Junk Mail globally (including blocked senders) by setting DisableAntiSpam to 1 in the registry under HKEY_CURRENT_USER\Software\Policies\Microsoft\Office\14.0\Outlook.  Create the path and/or keys if they do not exist.



For more info, see this post.

D. Disable Reminders

Outlook will create Reminders for all of your imported meetings, and show them as overdue.  Due to this, you may want to disable Reminders during the migration.

File => Options => Advanced, under Reminders header




  • Deselect "Show reminders:"

Next... add the conversion macro script to Outlook

Migrating GMail to Exchange (part 1) - Configure GMail

The first step of your migration process is to configure your GMail account.

1. IMAP Access

In order to see your messages in Outlook, you'll need to configure your GMail account first.  You can do this under Settings => Forwarding and POP/IMAP.


Select Enable IMAP.

OPTIMIZATION #1 (recommended)

In order to speed up the conversion process, it makes sense to decouple deletion of a message in Outlook's cache from the actual expunge of that e-mail.   This way, Outlook will mark messages for deletion first, and communicate them in one shot to the server at the end of the conversion. You do this by selecting "Auto-Expunge off".

Once you do this, "Archive the message" should be selected below.  You must do this so that all labels can be migrated over.  This allows each label to be migrated individually.

OPTIMIZATION #2 (consider)

I've found that the conversion speeds up considerably if the number of messages in a folder is small.  It may be that Outlook uses more memory traversing large folders, slowing down both the processor and the conversion process.  It may also be related to the size of the Outlook cache file, where Outlook downloads the IMAP folders' contents.  Either way, one way to control the size of the cache file and the required resident memory is to limit the number of messages in each folder.

I measured a conversion speed of about 15-30 seconds/item/label when the folder size was unlimited.  This dropped to 1-5 seconds/item/label when I limited the folder size to 1000 messages.

NOTE: If you limit the folder size, you will need to do the conversion in multiple iterations:

  1. download messages (up to the limit) to local Outlook cache
  2. convert downloaded messages
  3. purge the messages on the GMail server, so new messages can be downloaded
NOTE: there is a link at the bottom of this page with instructions to set up your Outlook to access your GMail messages.
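
To get a rough feel for the trade-off in OPTIMIZATION #2, the number of download/convert/purge iterations and the total conversion time can be estimated as below. This is a back-of-the-envelope sketch only: the function names are made up, the 60,000 figure is just an example, and the per-item rates are picked from the ranges I measured above, so your numbers will differ.

```python
import math

def iterations_needed(messages_in_folder, folder_limit):
    """Number of download/convert/purge rounds for one folder."""
    return math.ceil(messages_in_folder / folder_limit)

def conversion_hours(item_label_pairs, seconds_per_item_label):
    """Total time if every item/label pair takes the same number of seconds."""
    return item_label_pairs * seconds_per_item_label / 3600

# Example: a folder of 60,000 messages with the folder size limited to 1000
print(iterations_needed(60000, 1000), 'iterations')

# Example: 60,000 item/label pairs at 3 s each (limited folders)
# versus 20 s each (unlimited folders)
print(conversion_hours(60000, 3), 'hours with the limit')
print(conversion_hours(60000, 20), 'hours without the limit')
```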

2. Labels

Control your Outlook cache size

Another way to control the size of the Outlook cache is to select which labels show up in the IMAP cache.  Under Settings => Labels, you can select for each label whether or not to "Show in IMAP".  Here you can also see how many conversations carry a certain label.  This allows you to gauge the size of the Outlook cache.

E.g. suppose you have 60,000 conversations in total (you can see this number when looking at all your mail; it will say something like 1-50 of 60,000), they use 6GB (this shows at the bottom of your GMail page), and the conversation counts of all your labels add up to another 60,000.  You can then estimate your total cache to be 12GB: 6GB for All Mail plus 6GB for the copies in the label folders.  However, you also need to include the number of messages in your Inbox, Starred and Important, as these are not listed on the Labels page.
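
The same estimate can be written as a small calculation. This is a sketch under two assumptions that are mine, not GMail's: every conversation takes about the same space, and every copy in a label folder is cached in full. The function and parameter names are made up for illustration.

```python
def estimate_cache_gb(total_conversations, used_gb,
                      labeled_conversations, extra_conversations=0):
    """Estimate the Outlook IMAP cache size in GB.

    total_conversations:   count shown when looking at all your mail
    used_gb:               storage shown at the bottom of the GMail page
    labeled_conversations: sum of the counts on the Labels page
    extra_conversations:   Inbox, Starred and Important (not on the Labels page)
    """
    copies = total_conversations + labeled_conversations + extra_conversations
    # Scale the known storage by the number of cached copies
    return used_gb * copies / total_conversations

# The example from the text: 60,000 conversations, 6GB, 60,000 labeled
print(estimate_cache_gb(60000, 6, 60000))
```

Pass the Inbox, Starred and Important counts as extra_conversations to include them in the estimate.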

Exchange compatible label names

Outlook 2010 does not allow category names to start/end with a space, or to contain a comma (,) or semicolon (;).  Rename all of your labels that contain any of these characters.  Gmail labels already cannot start/end with a space.
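
If you have many labels, a quick check like the following can flag the ones that need renaming. This is a minimal sketch that only encodes the three rules listed above; the function name and the sample labels are made up, and you would have to supply your own list of label names.

```python
def category_name_problems(label):
    """Return a list of reasons why a label is not a valid category name."""
    problems = []
    if label.startswith(' ') or label.endswith(' '):
        problems.append('starts or ends with a space')
    for ch in ',;':
        if ch in label:
            problems.append('contains ' + repr(ch))
    return problems

# Made-up sample labels
for label in ['Work', 'Bills, 2012', 'To do; later']:
    problems = category_name_problems(label)
    if problems:
        print(repr(label), 'needs renaming:', ', '.join(problems))
```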

3. Decide what to do with your Trash

A. Do not migrate trash

The easiest option is to not migrate your trash.  Then, for best performance, go to your trash, and click "Empty Trash now".

B. Migrate trash with all labels

Messages in the trash retain their labels; however, these are not exported to the IMAP interface (deleted messages will not appear under their labeled directories).  If you want to migrate the messages in the trash with all their labels, you will need to add your own "trash" label.  You can do this by

1. Go to your trash

2. Select all mail on the screen using the checkbox

3. If you have more than 50 messages, a link will appear "Select all X conversations in Trash".  Click that link.

4. Move all mail in the trash under a new label (e.g. "T").  Make sure you move to a newly created label so that you can keep messages from your trash separately.

During conversion, you can designate this label to go to your Trash folder.

NOTE: you can also use this trick to permanently delete only those trashed messages that carry certain labels.  Now that your trash is empty, you can selectively move back into it only the messages that you want to delete permanently.

NOTE: if you are migrating your trash with labels, this will increase the size of the Outlook cache similarly to having multiple labels on regular messages, as these messages will now appear in the folders of all of their labels.

C. Migrate trash without labels

If you want to migrate your trash without your labels, you can keep them where they are in GMail.  You will need to set your trash label to "Trash" during migration.

Migrating GMail to Exchange (part 0)

Update: I have now posted a quicker procedure to migrate your GMail to MS Exchange here.

Recently I took on the project of migrating from GMail to MS Exchange.  This is proving to be an arduous process.  While I see no sane reason to move from GMail, with its universal accessibility and easy interface, to Exchange, sometimes there are external reasons.  E.g. your IT department decided after their Google Apps trial that it is not for them.

This series of posts will document a way to do this, while at least keeping all your GMail labels with your messages.  There may be other, easier ways, but a simple Google search did not yield a useable result.

You will need to have Outlook (part of MS Office) for this process.  Exchange (or Outlook at least) allows you to attach 'categories' to each item.  We will use these instead of GMail's labels.  Good news: I was able to translate all of my GMail filters to Outlook rules, although these have to run on the client (i.e. in Outlook).  I'm sure you'll be able to do the same.

This guide is written for Outlook 2010, but should also work similarly for Outlook 2003 or 2007.  Note of caution: the conversion will take a LOT of time.  In my case it is taking 2.5 seconds/email/label.  You will not be able to use Outlook during the conversion. I recommend using Outlook Web Access if you have to access Exchange during the conversion, unless you have access to another machine.

We will use GMail's IMAP interface to access your old mail.  In this interface, there is a folder for each label, and a message will appear in ALL of the folders of its labels.  Outlook will see each of these copies as separate emails.  However, after the conversion, all of these copies will be consolidated into a single email with the labels represented as categories.
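
Conceptually, the consolidation works like the sketch below. It is not the conversion macro itself: it assumes that copies of the same message are byte-for-byte identical in every folder and identifies them by a SHA-1 hash of the raw message, and the function name and sample data are made up for illustration.

```python
import hashlib

def consolidate(copies):
    """Merge per-folder copies into one entry per message.

    copies: iterable of (folder_name, raw_message_bytes) pairs,
            one pair for each copy that IMAP shows.
    Returns a dict mapping message hash -> {'message', 'categories'}.
    """
    merged = {}
    for folder, raw in copies:
        key = hashlib.sha1(raw).hexdigest()
        entry = merged.setdefault(key, {'message': raw, 'categories': set()})
        # Each folder (label) the copy was found in becomes a category
        entry['categories'].add(folder)
    return merged

# Made-up sample: one message under two labels, another under one
copies = [
    ('Work', b'Subject: budget\r\n\r\nhello'),
    ('Travel', b'Subject: budget\r\n\r\nhello'),
    ('Work', b'Subject: lunch\r\n\r\nhi'),
]
for entry in consolidate(copies).values():
    print(sorted(entry['categories']))
```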

Limitation

Chat messages are not exported to IMAP, so they will not be migrated to Outlook.
