I have heard that SharePoint could be the answer to a loosely managed, fast-growing file share. I was asked to try to find a SharePoint-based solution for a file share that currently houses millions of files and terabytes of data. I see a number of huge challenges with this exercise if we use SharePoint, and it raises many questions.
Should we import the documents into the database, should we use RBS, or should the files stay on the file share while we import only metadata or references? How would that happen? What type of storage would we need, and how would you set up the content databases? Would you use multiple drives? How will this affect backups? Also, what is the best way to migrate tens of millions of files?
We would probably want to eliminate duplicates, keep versions, and store metadata for easy searching.
Does anyone have any experience, ideas or suggestions regarding this type of process being managed in SharePoint? Are there any third-party applications that make sense? Has anyone actually done something like this before?
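On the duplicate-elimination point, one way to get a feel for how many duplicates exist before deciding anything is to compare content hashes on the file share itself. The following is only a rough sketch under stated assumptions, not part of SharePoint or any migration tool: the function names `sha256_of` and `find_duplicates` are made up for illustration, and "duplicate" here means byte-for-byte identical content. Files are grouped by size first so that only files sharing a size get hashed.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of paths under root whose contents are identical.

    Hypothetical helper: assumes 'duplicate' means same bytes,
    regardless of file name or folder.
    """
    # Pass 1: group by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file; skip it

    # Pass 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_hash[sha256_of(path)].append(path)
            except OSError:
                continue

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_duplicates` on the root of the share returns a list of path groups. On tens of millions of files this would take a long time and would probably need to be run folder by folder, so treat it as a way to sample the problem rather than a finished tool.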
Thanks in advance for your help.
Vikki McCormick
Sort of. I have never worked with a Search app before. There are over 20 million files and more than 5 terabytes of data. If I test on a smaller scale it might not give me the same results as when I am fully implemented. I will have to put in extra time to figure out how it works best. So my next step should be to set up a Search application and try it out. I hear it is best to set up all your metadata first. Is that true? If so, any recommendations?
Some further thoughts:
http://www.cloudsearchportal.com/are-you-considering-upgrading-to-sharepoint-2013-search/
I don’t think you’re missing anything, Vikki. In my experience, this is a common request: move everything from the fileshare to SharePoint.
You’re right, there needs to be some justification (up front) for the effort.
If search is the goal, then crawling the fileshare from SharePoint makes sense as a solution. SharePoint search is quite fast and easy to set up. However, with so many documents, you will need a good amount of server horsepower to index all of those docs.
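To get a rough idea of what the crawler would be facing, it can help to inventory the fileshare first: how many files there are and how many bytes, broken down by extension. The sketch below is only an illustration, assuming a plain walk of the share is acceptable; the function name `inventory` is hypothetical and nothing here comes from SharePoint itself.

```python
import os
from collections import Counter


def inventory(root):
    """Count files and total bytes per extension under root.

    Hypothetical helper for sizing purposes only; files that
    cannot be read are skipped rather than reported.
    """
    counts = Counter()
    sizes = Counter()
    for folder, _dirs, files in os.walk(root):
        for name in files:
            # Normalise the extension; files without one go in "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                size = os.path.getsize(os.path.join(folder, name))
            except OSError:
                continue
            counts[ext] += 1
            sizes[ext] += size
    return counts, sizes
```

The two counters returned can be sorted by size or count to see which file types dominate the share, which is useful when deciding what is worth crawling at all.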
Are you unconvinced about the benefits of SharePoint search?
Hi Phil,
Thank you for your response. I don’t think this is an isolated situation when it comes to managing archive files that need to be kept around for reference, and deciding what can be done with them. I would think of these files as a library or a history.
I guess we are looking at number 3. I really don’t want to migrate all those documents into SharePoint at all. I know that if we went through an exercise like this and wanted to spend time and money on it, it would have to be an improvement over what we have, which is more of a manual search.
So what I don’t get is how SharePoint would improve this process. I’m missing something.
Does it search a file share faster? Do you get to add some extra or customized metadata to assist your search and to help categorize the information? I only see a couple of ways of doing this.
1) A custom web part or third-party web part/app that connects a fileshare to SharePoint.
2) Importing the documents (for me, not an option).
3) Crawling the fileshare.
What would offer the greatest benefit in managing files like this? Is there a benefit to doing this in SharePoint? I just feel like I am missing something.
Thanks again.
Vikki
I think you need to ask the “powers that be” one question – why?
1) Does the organisation need to collaborate around millions of files with version control, checkout, approvals, workflows etc.?
2) Is the aim to get a handle on permissions and push the burden of permissions management off IT’s plate and onto the business units (end users)?
3) Is it a matter of search?
If the answer to question #1 is “yes”, then you need to figure out whether all of that data really needs to migrate to SharePoint. Unless the organisation is home to tens of thousands of users, much of that documentation is bound to be historical/archival. What are the advantages to having that material living in SharePoint?
If the answer to question #2 is “yes”, then I would be wary of the consequences here. Do end users want the responsibility of managing permissions on their own? Can they handle the responsibility (both from a technical and workload perspective)? What about compliance?
If the answer to #3 is “yes”, then you can simply counter by crawling the existing fileshare with SharePoint. Might need to build up/out your farm a bit to handle the load, but it’s definitely doable.
Certainly, there are tools to migrate all of these documents into SharePoint. But why? There was a time when it made no sense to undertake such an operation because everything was stored in the content databases, but that time has passed. With RBS and shredded storage in SharePoint 2013, this endeavour is entirely feasible.
Again, why?
Note that there are also tools out there for simply “connecting” fileshare documents to SharePoint without migrating them. Look at DocAve Connector, for example.
The “fancy fileshare” solution is the oldest trick in the book for managers hungry to dive into the SharePoint world. Hopefully this isn’t an “adoption” mission!